Modeling conformance to thematic concepts

ABSTRACT

Embodiments are directed to managing data using network computers. A data graph may be provided based on knowledge graphs and information provided by data sources. Concepts and entities may be provided based on the data graph. Scoring models may be determined based on the concepts and the entities. Thematic scores for the entities may be generated based on the scoring models and the data graph such that the thematic scores include values that quantify each relationship between the concepts and the entities and such that an entity with a higher thematic score value for a concept has a relationship strength value that exceeds another relationship strength value for another entity with a lower thematic score value for the concept. A report that includes the thematic scores, the entities, and the concepts may be provided.

TECHNICAL FIELD

The present invention relates generally to data management, and more particularly, but not exclusively, to modeling organization conformance to thematic concepts.

BACKGROUND

The ever-increasing amount of available information associated with the messaging, behavior, performance, or the like, provides business analysts an enormous amount of information to compare or analyze various entities, such as, businesses or organizations. In many cases, information from many public or private sources may be easily available for many different organizations. Accordingly, the available information may provide near limitless opportunity to research the interests or activities of various organizations. Likewise, this information enables deep or complex analysis of markets or industries and their constituent organizations. However, the volume of available information and the numerous sources of information may require significant time or effort for analysts to digest or otherwise understand. In many cases, it may be difficult for individuals or groups of analysts to identify relevant information, let alone read and understand it. Accordingly, analysts are often required to rely on personal knowledge or experience to guide their research or analysis rather than obtaining or reviewing much of the available information. In some cases, analysts may rely on instinct or hunches because they lack the time or resources to review the nearly limitless supply of information that may be continuously generated. Thus, it is with respect to these considerations and others that the present invention has been made.

BRIEF DESCRIPTION OF THE DRAWINGS

Non-limiting and non-exhaustive embodiments of the present innovations are described with reference to the following drawings. In the drawings, like reference numerals refer to like parts throughout the various figures unless otherwise specified. For a better understanding of the described innovations, reference will be made to the following Detailed Description of Various Embodiments, which is to be read in association with the accompanying drawings, wherein:

FIG. 1 illustrates a system environment in which various embodiments may be implemented;

FIG. 2 illustrates a schematic embodiment of a client computer;

FIG. 3 illustrates a schematic embodiment of a network computer;

FIG. 4 illustrates a logical architecture of a portion of a portfolio platform for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 5 illustrates a logical schematic of a portion of a concept graph for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 6 illustrates a logical schematic of a portion of a concept graph for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 7 illustrates a logical schematic of a portion of an entity graph for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 8 illustrates a logical schematic of a portion of a data graph for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 9 illustrates an overview flowchart for a process for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 10 illustrates a flowchart for a process for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 11 illustrates a flowchart for a process for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 12 illustrates a flowchart for a process for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 13 illustrates a flowchart for a process for associating thematic concepts and organizations in accordance with one or more of the various embodiments;

FIG. 14 illustrates a logical schematic of a portion of a system for modeling conformance to thematic concepts in accordance with one or more of the various embodiments;

FIG. 15 illustrates a logical schematic of a portion of a scoring system for modeling conformance to thematic concepts in accordance with one or more of the various embodiments;

FIG. 16 illustrates a logical schematic of a portion of a scoring system for modeling conformance to thematic concepts in accordance with one or more of the various embodiments;

FIG. 17 illustrates a logical representation of a portion of a user interface for modeling conformance to thematic concepts in accordance with one or more of the various embodiments;

FIG. 18 illustrates an overview flowchart for a process for modeling conformance to thematic concepts in accordance with one or more of the various embodiments;

FIG. 19 illustrates a flowchart for a process for determining related concepts that may be associated with a theme for modeling conformance to thematic concepts in accordance with one or more of the various embodiments; and

FIG. 20 illustrates a flowchart for a process for modeling conformance to thematic concepts in accordance with one or more of the various embodiments.

DETAILED DESCRIPTION OF VARIOUS EMBODIMENTS

Various embodiments now will be described more fully hereinafter with reference to the accompanying drawings, which form a part hereof, and which show, by way of illustration, specific exemplary embodiments by which the invention may be practiced. The embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the embodiments to those skilled in the art. Among other things, the various embodiments may be methods, systems, media or devices. Accordingly, the various embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. The following detailed description is, therefore, not to be taken in a limiting sense.

Throughout the specification and claims, the following terms take the meanings explicitly associated herein, unless the context clearly dictates otherwise. The phrase “in one embodiment” as used herein does not necessarily refer to the same embodiment, though it may. Furthermore, the phrase “in another embodiment” as used herein does not necessarily refer to a different embodiment, although it may. Thus, as described below, various embodiments may be readily combined, without departing from the scope or spirit of the invention.

In addition, as used herein, the term “or” is an inclusive “or” operator, and is equivalent to the term “and/or,” unless the context clearly dictates otherwise. The term “based on” is not exclusive and allows for being based on additional factors not described, unless the context clearly dictates otherwise. In addition, throughout the specification, the meaning of “a,” “an,” and “the” includes plural references. The meaning of “in” includes “in” and “on.”

For example embodiments, the following terms are also used herein according to the corresponding meaning, unless the context clearly dictates otherwise.

As used herein the term, “engine” refers to logic embodied in hardware or software instructions, which can be written in a programming language, such as C, C++, Objective-C, COBOL, Java™, PHP, Perl, JavaScript, Ruby, VBScript, Microsoft .NET™ languages such as C#, or the like. An engine may be compiled into executable programs or written in interpreted programming languages. Software engines may be callable from other engines or from themselves. Engines described herein refer to one or more logical modules that can be merged with other engines or applications, or can be divided into sub-engines. The engines can be stored in non-transitory computer-readable medium or computer storage device and be stored on and executed by one or more general purpose computers, thus creating a special purpose computer configured to provide the engine.

As used herein the term “data source” refers to a service, system, or facility that may provide data to a data ingestion platform. Data sources may be local (e.g., on premises databases, reachable via a local area network, or the like) or remote (e.g., reachable over a wide-area network, remote endpoints, or the like). In some cases, data sources may be streams that provide continuous or intermittent flows of data to a data ingestion platform. Further, in some cases, data sources may be local or remote file systems, document management systems, cloud-based storage, or the like. Data sources may support one or more conventional or custom communication or data transfer protocols, such as, TCP/IP, HTTP, FTP, SFTP, SCP, RTP, or the like. In some cases, data sources may be owned, managed, or operated by various organizations that may provide data to a data ingestion platform. In some instances, data sources may be public or private websites or other public or private repositories that enable third parties to access hosted content.

As used herein the term “raw data source” refers to a data source that generally provides its data as is, or otherwise with little coordination with a data ingestion platform. In most cases, raw data sources provide data that may require additional parsing or processing before it is usable by a portfolio platform.

As used herein the term “raw data” refers to data provided by a raw data source. Raw data may include structured or unstructured data, documents, streams, or the like. Provided data may be considered as raw because the data source may provide the data in a form or format “as-is.”

As used herein the terms “organization,” or “entity” refer to the various businesses, companies, associations, institutions, partnerships, states, agencies, or the like, that may be analyzed or evaluated based on thematic scores, or the like.

As used herein the term “concept graph” refers to one or more data structures or data models that include objects that may represent concepts and their respective relationships. Concept graphs may be based on or represent one or more ontologies. Ontologies or taxonomies represented in concept graphs may be pre-defined, custom, or portions of existing ontologies or taxonomies, or combinations thereof. Ontologies or taxonomies may be determined by one or more of subject matter experts and/or machine language processing of information from one or more data sources. Also, each instance of a concept may be determined by one or more associations of information from one or more data sources, such as a field of a data object, a portion of a document, a row of a database table, or the like. Further, concept graphs represent the logical organization or relationships of concepts. Each node of a concept graph may be associated with one or more other concepts.

As used herein the term “entity graph” refers to one or more data structures or data models that include objects that may represent entities or organizations and their respective relationships with other entities or organizations. Each node of an entity graph may be associated with one or more different entities or organizations. Also, nodes in entity graphs may be associated with various attributes of each represented entity or organization. Each node in an entity graph may be considered to represent individual instances of entities or organizations rather than classes or types of entities or organizations.

As used herein the term “data graph” refers to one or more data structures or data models that include objects that may represent a synthesis of information from concept graphs and organization graphs.
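
As a non-limiting illustration of the three graph definitions above, the following minimal Python sketch represents each graph as a weighted adjacency mapping and synthesizes a data graph from a concept graph, an entity graph, and entity-to-concept links. The `Graph` class, the `build_data_graph` function, the numeric strength values, and the example names are hypothetical and are assumed here only for illustration; the embodiments are not limited to this representation.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Undirected weighted graph: node -> {neighbor: relationship strength}."""
    edges: dict = field(default_factory=dict)

    def add_edge(self, a, b, strength):
        # Store the relationship in both directions.
        self.edges.setdefault(a, {})[b] = strength
        self.edges.setdefault(b, {})[a] = strength


def build_data_graph(concept_graph, entity_graph, links):
    """Merge concept and entity relationships, then add entity-to-concept links.

    `links` is an iterable of (entity, concept, strength) tuples, assumed to
    come from information ingested from data sources.
    """
    data_graph = Graph()
    for graph in (concept_graph, entity_graph):
        for node, neighbors in graph.edges.items():
            for neighbor, strength in neighbors.items():
                data_graph.add_edge(node, neighbor, strength)
    for entity, concept, strength in links:
        data_graph.add_edge(entity, concept, strength)
    return data_graph


concepts = Graph()
concepts.add_edge("green energy", "solar power", 0.9)
entities = Graph()
entities.add_edge("AcmeCo", "BetaCorp", 0.5)
data_graph = build_data_graph(concepts, entities, [("AcmeCo", "solar power", 0.7)])
```

In this sketch the data graph keeps the nodes of both source graphs, so a traversal may move from an entity to a concept and onward to related concepts.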

As used herein the term “theme” refers to a high-level concept that encompasses one or more lower-level concepts. In some cases, themes may refer to areas of technology, industry domains, social structures, or the like. Examples of themes may include 5G telecommunications, petroleum, green energy, sustainability, medicine, elder care, or the like. In some cases, a concept may be associated with more than one theme.

As used herein the term “compound theme,” or “meta-theme” refers to a theme comprised of one or more other themes. For example, a compound theme of environmental protections may include sub-themes of solar power, wind power, green buildings, wildlife conservation, or the like.

As used herein the term “query” refers to commands or sequences of commands that are used for querying, searching or retrieving data from a modeling system. Queries generally produce a result or results depending on the form and structure of the particular query string. Graph Query Language (GraphQL) is a well-known query language often used to form queries for graph-based databases. However, the various embodiments are not limited to using GraphQL-like formatting for query strings. Accordingly, other well-known query languages or custom query languages may be employed.

As used herein the term “ingestion model” refers to one or more data structures that encapsulate the data, rules, machine learning models, machine learning classifiers, natural language processing instructions, or instructions that may be employed to match or map information provided by data sources to one or more knowledge graphs. Ingestion models may include various components, such as, one or more machine learning based classifiers, heuristics, rules, pattern matching, conditions, or the like, that may be employed to match or map information to one or more knowledge graphs. Different ingestion models may be provided for different categories of information. For example, one ingestion model may be directed to ingesting information included in press releases while another ingestion model may be directed to ingesting information included in formal public disclosures, such as, earning calls, merger notices, or the like.

As used herein the term “portfolio model” refers to one or more data structures that encapsulate the data, rules, machine learning models, machine learning classifiers, or instructions that may be employed to determine one or more organizations or entities that may be correlated with one or more concepts or themes.

As used herein the term “thematic score” refers to one or more data structures that include data or metadata that quantify how closely one or more organizations correlate with concepts or themes.

As used herein the term “scoring model” refers to one or more data structures that encapsulate the data, rules, machine learning models, machine learning classifiers, or instructions that may be employed to generate thematic scores.

As used herein the terms “organization score,” or “entity score” refer to a score that represents how well an organization or entity conforms to the concepts of a theme. It reflects a portion of the thematic score that may be attributed to actions, communications, or the like, associated with the organization or entity itself. The criteria for computing organization scores, as well as their level of contribution to thematic scores may be determined by scoring models used to evaluate the organizations.

As used herein the term “network score” refers to a score that represents the prominence of the concepts of a theme within organizations or entities that may be related to the organization of interest. The criteria for computing network scores, as well as their level of contribution to the overall thematic scores may be determined by scoring models used to evaluate the organization.

As used herein the term “general score” refers to a score that may represent the interest or prominence of a theme or concept in the wider public or society. The criteria for computing general scores, as well as their level of contribution to thematic scores may be determined by scoring models used to evaluate the organization.

As used herein the term “configuration information” refers to information that may include rule based policies, pattern matching, scripts (e.g., computer readable instructions), or the like, that may be provided from various sources, including, configuration files, databases, user input, built-in defaults, plug-ins, extensions, or the like, or combination thereof.

The following briefly describes embodiments of the invention in order to provide a basic understanding of some aspects of the invention. This brief description is not intended as an extensive overview. It is not intended to identify key or critical elements, or to delineate or otherwise narrow the scope. Its purpose is merely to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.

Briefly stated, various embodiments are directed to managing data using one or more network computers. In one or more of the various embodiments, a data graph may be provided based on one or more knowledge graphs and information provided by one or more data sources.

In one or more of the various embodiments, one or more concepts and one or more entities may be provided based on the data graph. In one or more of the various embodiments, providing the one or more entities may include providing one or more portfolios that include the one or more entities such that each included entity contributes one or more partial thematic scores based on the one or more scoring models.
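
One hedged way to picture how each entity in a portfolio might contribute a partial thematic score is sketched below in Python. The `portfolio_thematic_score` function, the use of portfolio weights to scale each contribution, and the example values are assumptions made for illustration and are not taken from the embodiments, which leave the contribution rule to the scoring models.

```python
def portfolio_thematic_score(holdings, entity_scores):
    """Return (portfolio score, per-entity partial scores).

    holdings: entity -> portfolio weight (any positive scale; normalized here).
    entity_scores: entity -> thematic score for a single concept or theme.
    Entities without a known score contribute 0.0 (an assumption of this sketch).
    """
    total_weight = sum(holdings.values())
    if total_weight <= 0:
        raise ValueError("portfolio weights must sum to a positive value")
    partial = {
        entity: (weight / total_weight) * entity_scores.get(entity, 0.0)
        for entity, weight in holdings.items()
    }
    return sum(partial.values()), partial


score, partial = portfolio_thematic_score(
    {"AcmeCo": 3, "BetaCorp": 1},
    {"AcmeCo": 0.8, "BetaCorp": 0.4},
)
```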

In one or more of the various embodiments, one or more scoring models may be determined based on the one or more concepts and the one or more entities.

In one or more of the various embodiments, one or more thematic scores for the one or more entities may be generated based on the one or more scoring models and the data graph such that the one or more thematic scores include one or more values that quantify each relationship between the one or more concepts and the one or more entities and such that an entity with a higher thematic score value for a concept has a relationship strength value that exceeds another relationship strength value for another entity with a lower thematic score value for the concept. In one or more of the various embodiments, generating the one or more thematic scores may include traversing the data graph to determine the one or more entities that are related to the one or more concepts based on one or more relationships represented in the data graph such that the one or more thematic scores are based on the strength of relationship between each entity and each concept.

In one or more of the various embodiments, a report that includes the one or more thematic scores, the one or more entities, and the one or more concepts may be provided.

In one or more of the various embodiments, providing the one or more concepts may include: providing a theme based on an anchor concept theme; and traversing a concept graph to determine the one or more concepts based on one or more relationships between the one or more concepts and the anchor concept such that each relationship between the anchor concept and the one or more concepts may be associated with each relationship strength value that exceeds a threshold value.
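
The traversal from an anchor concept may be sketched as follows. This minimal Python example assumes, for illustration only, that relationship strengths lie in the range (0, 1] and that the strength of an indirect relationship is the product of the strengths along the strongest path; the embodiments do not prescribe either rule, and the `related_concepts` function and the example concept names are hypothetical.

```python
import heapq


def related_concepts(concept_graph, anchor, threshold):
    """Return {concept: strength} for concepts whose strength to the anchor exceeds threshold.

    concept_graph: concept -> {neighboring concept: strength in (0, 1]}.
    """
    best = {anchor: 1.0}
    heap = [(-1.0, anchor)]  # max-heap on strength via negated values
    while heap:
        negated, node = heapq.heappop(heap)
        strength = -negated
        if strength < best.get(node, 0.0):
            continue  # stale entry; a stronger path was already found
        for neighbor, weight in concept_graph.get(node, {}).items():
            candidate = strength * weight
            # Keep only relationships above the threshold, and only improvements.
            if candidate > threshold and candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    del best[anchor]
    return best


graph = {
    "green energy": {"solar power": 0.9, "coal": 0.2},
    "solar power": {"green energy": 0.9, "photovoltaics": 0.8},
    "photovoltaics": {"solar power": 0.8},
    "coal": {"green energy": 0.2},
}
theme_concepts = related_concepts(graph, "green energy", threshold=0.5)
```

Here the anchor "green energy" pulls in "solar power" directly and "photovoltaics" indirectly, while "coal" falls below the threshold and is excluded from the theme.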

In one or more of the various embodiments, generating the one or more thematic scores for the one or more entities may include: determining an entity score for each entity based on the entity and its relationship with the one or more concepts; determining a network score for each entity based on the entity and one or more other entities that are related to the one or more entities such that the network score may be based on one or more relationships between the one or more other entities and the one or more concepts; determining a general score for each entity based on one or more of a news report, curated data set, social media feed, periodical article feed, government agency disclosure, government agency submission, or scientific paper; and employing the one or more scoring models to combine the entity score, the network score, and general score into one or more thematic score values.
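
As a hedged sketch of the combining step, the following Python example treats a scoring model as a set of weights and combines the three component scores as a normalized weighted sum. The weighted-sum rule, the `combine_scores` function, and the example weights are assumptions for illustration; a scoring model may instead use rules, classifiers, or other instructions.

```python
def combine_scores(entity_score, network_score, general_score, weights):
    """Combine component scores into one thematic score value.

    weights: mapping with keys "entity", "network", and "general"; the weights
    are normalized so they need not sum to 1.
    """
    total = weights["entity"] + weights["network"] + weights["general"]
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = (
        weights["entity"] * entity_score
        + weights["network"] * network_score
        + weights["general"] * general_score
    )
    return weighted_sum / total


thematic = combine_scores(
    entity_score=0.8,
    network_score=0.5,
    general_score=0.2,
    weights={"entity": 0.5, "network": 0.3, "general": 0.2},
)
```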

In one or more of the various embodiments, providing the report that includes the one or more thematic scores may include displaying a value that quantifies a strength of each relationship between the one or more entities and the one or more concepts.

In one or more of the various embodiments, providing the one or more thematic scores, further comprises: employing the one or more scoring models to provide one or more sub-scores for each thematic score based on a weighting that corresponds to a contribution of the one or more sub-scores to the one or more thematic scores; and generating one or more score containers that each include one or more values for the one or more weighted sub-scores that correspond to the one or more entities.
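
A score container of this kind might be sketched in Python as shown below. The `ScoreContainer` class, the `make_container` function, and the choice to derive the thematic score as the sum of the weighted sub-scores are hypothetical and are assumed only to illustrate how weighted sub-scores may be kept together per entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreContainer:
    """Holds the weighted sub-scores that correspond to one entity."""
    entity: str
    weighted_sub_scores: dict

    @property
    def thematic_score(self):
        # Assumption of this sketch: the thematic score is the sum of its weighted parts.
        return sum(self.weighted_sub_scores.values())


def make_container(entity, sub_scores, weights):
    """Apply each sub-score's weight and wrap the results in a container."""
    weighted = {name: weights[name] * value for name, value in sub_scores.items()}
    return ScoreContainer(entity=entity, weighted_sub_scores=weighted)


container = make_container(
    "AcmeCo",
    sub_scores={"entity": 0.8, "network": 0.5},
    weights={"entity": 0.6, "network": 0.4},
)
```

Keeping the weighted sub-scores alongside the total lets a report show how much each sub-score contributed to an entity's thematic score.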

Illustrated Operating Environment

FIG. 1 shows components of one embodiment of an environment in which embodiments of the invention may be practiced. Not all of the components may be required to practice the invention, and variations in the arrangement and type of the components may be made without departing from the spirit or scope of the invention. As shown, system 100 of FIG. 1 includes local area networks (LANs)/wide area networks (WANs)-(network) 110, wireless network 108, client computers 102-105, portfolio platform server computer 116, or the like.

At least one embodiment of client computers 102-105 is described in more detail below in conjunction with FIG. 2. In one embodiment, at least some of client computers 102-105 may operate over one or more wired or wireless networks, such as networks 108, or 110. Generally, client computers 102-105 may include virtually any computer capable of communicating over a network to send and receive information, perform various online activities, offline actions, or the like. In one embodiment, one or more of client computers 102-105 may be configured to operate within a business or other entity to perform a variety of services for the business or other entity. For example, client computers 102-105 may be configured to operate as a web server, firewall, client application, media player, mobile telephone, game console, desktop computer, or the like. However, client computers 102-105 are not constrained to these services and may also be employed, for example, for end-user computing in other embodiments. It should be recognized that more or fewer client computers (as shown in FIG. 1) may be included within a system such as described herein, and embodiments are therefore not constrained by the number or type of client computers employed.

Computers that may operate as client computer 102 may include computers that typically connect using a wired or wireless communications medium such as personal computers, multiprocessor systems, microprocessor-based or programmable electronic devices, network PCs, or the like. In some embodiments, client computers 102-105 may include virtually any portable computer capable of connecting to another computer and receiving information such as, laptop computer 103, mobile computer 104, tablet computers 105, or the like. However, portable computers are not so limited and may also include other portable computers such as cellular telephones, display pagers, radio frequency (RF) devices, infrared (IR) devices, Personal Digital Assistants (PDAs), handheld computers, wearable computers, integrated devices combining one or more of the preceding computers, or the like. As such, client computers 102-105 typically range widely in terms of capabilities and features. Moreover, client computers 102-105 may access various computing applications, including a browser, or other web-based application.

A web-enabled client computer may include a browser application that is configured to send requests and receive responses over the web. The browser application may be configured to receive and display graphics, text, multimedia, and the like, employing virtually any web-based language. In one embodiment, the browser application is enabled to employ JavaScript, HyperText Markup Language (HTML), eXtensible Markup Language (XML), JavaScript Object Notation (JSON), Cascading Style Sheets (CSS), or the like, or combination thereof, to display and send a message. In one embodiment, a user of the client computer may employ the browser application to perform various activities over a network (online). However, another application may also be used to perform various online activities.

Client computers 102-105 also may include at least one other client application that is configured to receive or send content between another computer. The client application may include a capability to send or receive content, or the like. The client application may further provide information that identifies itself, including a type, capability, name, and the like. In one embodiment, client computers 102-105 may uniquely identify themselves through any of a variety of mechanisms, including an Internet Protocol (IP) address, a phone number, Mobile Identification Number (MIN), an electronic serial number (ESN), a client certificate, or other device identifier. Such information may be provided in one or more network packets, or the like, sent between other client computers, ingestion platform server computer 116, profile correlation server computer 118, or other computers.

Client computers 102-105 may further be configured to include a client application that enables an end-user to log into an end-user account that may be managed by another computer, such as ingestion platform server computer 116, profile correlation server computer 118, or the like. Such an end-user account, in one non-limiting example, may be configured to enable the end-user to manage one or more online activities, including in one non-limiting example, project management, software development, system administration, configuration management, search activities, social networking activities, browse various websites, communicate with other users, or the like. Also, client computers may be arranged to enable users to display reports, interactive user-interfaces, or results provided by portfolio platform server computer 116, or the like.

Wireless network 108 is configured to couple client computers 103-105 and its components with network 110. Wireless network 108 may include any of a variety of wireless sub-networks that may further overlay stand-alone ad-hoc networks, and the like, to provide an infrastructure-oriented connection for client computers 103-105. Such sub-networks may include mesh networks, Wireless LAN (WLAN) networks, cellular networks, and the like. In one embodiment, the system may include more than one wireless network.

Wireless network 108 may further include an autonomous system of terminals, gateways, routers, and the like connected by wireless radio links, and the like. These connectors may be configured to move freely and randomly and organize themselves arbitrarily, such that the topology of wireless network 108 may change rapidly.

Wireless network 108 may further employ a plurality of access technologies including 2nd (2G), 3rd (3G), 4th (4G), 5th (5G) generation radio access for cellular systems, WLAN, Wireless Router (WR) mesh, and the like. Access technologies such as 2G, 3G, 4G, 5G, and future access networks may enable wide area coverage for mobile computers, such as client computers 103-105 with various degrees of mobility. In one non-limiting example, wireless network 108 may enable a radio connection through a radio network access such as Global System for Mobile communication (GSM), General Packet Radio Services (GPRS), Enhanced Data GSM Environment (EDGE), code division multiple access (CDMA), time division multiple access (TDMA), Wideband Code Division Multiple Access (WCDMA), High Speed Downlink Packet Access (HSDPA), Long Term Evolution (LTE), and the like. In essence, wireless network 108 may include virtually any wireless communication mechanism by which information may travel between client computers 103-105 and another computer, network, a cloud-based network, a cloud instance, or the like.

Network 110 is configured to couple network computers with other computers, including, portfolio platform computer 116, client computers 102, and client computers 103-105 through wireless network 108, or the like. Network 110 is enabled to employ any form of computer readable media for communicating information from one electronic device to another. Also, network 110 can include the Internet in addition to local area networks (LANs), wide area networks (WANs), direct connections, such as through a universal serial bus (USB) port, Ethernet port, other forms of computer-readable media, or any combination thereof. On an interconnected set of LANs, including those based on differing architectures and protocols, a router acts as a link between LANs, enabling messages to be sent from one to another. In addition, communication links within LANs typically include twisted wire pair or coaxial cable, while communication links between networks may utilize analog telephone lines, full or fractional dedicated digital lines including T1, T2, T3, and T4, or other carrier mechanisms including, for example, E-carriers, Integrated Services Digital Networks (ISDNs), Digital Subscriber Lines (DSLs), wireless links including satellite links, or other communications links known to those skilled in the art. Moreover, communication links may further employ any of a variety of digital signaling technologies, including without limit, for example, DS-0, DS-1, DS-2, DS-3, DS-4, OC-3, OC-12, OC-48, or the like. Furthermore, remote computers and other related electronic devices could be remotely connected to either LANs or WANs via a modem and temporary telephone link. In one embodiment, network 110 may be configured to transport information of an Internet Protocol (IP).

Additionally, communication media typically embodies computer readable instructions, data structures, program modules, or other transport mechanism and includes any information non-transitory delivery media or transitory delivery media. By way of example, communication media includes wired media such as twisted pair, coaxial cable, fiber optics, wave guides, and other wired media and wireless media such as acoustic, RF, infrared, and other wireless media.

Also, one embodiment of portfolio platform server computer 116 is described in more detail below in conjunction with FIG. 3. Although FIG. 1 illustrates portfolio platform server computer 116, or the like, as a single computer, the innovations or embodiments are not so limited. For example, one or more functions of portfolio platform server computer 116, or the like, may be distributed across one or more distinct network computers. Moreover, in one or more embodiments, portfolio platform server computer 116 may be implemented using a plurality of network computers. Further, in one or more of the various embodiments, portfolio platform server computer 116, or the like, may be implemented using one or more cloud instances in one or more cloud networks. Accordingly, these innovations and embodiments are not to be construed as being limited to a single environment, and other configurations, and other architectures are also envisaged.

Illustrative Client Computer

FIG. 2 shows one embodiment of client computer 200 that may include many more or fewer components than those shown. Client computer 200 may represent, for example, one or more embodiments of mobile computers or client computers shown in FIG. 1.

Client computer 200 may include processor 202 in communication with memory 204 via bus 228. Client computer 200 may also include power supply 230, network interface 232, audio interface 256, display 250, keypad 252, illuminator 254, video interface 242, input/output interface 238, haptic interface 264, global positioning systems (GPS) receiver 258, open air gesture interface 260, temperature interface 262, camera(s) 240, projector 246, pointing device interface 266, processor-readable stationary storage device 234, and processor-readable removable storage device 236. Client computer 200 may optionally communicate with a base station (not shown), or directly with another computer. And in one embodiment, although not shown, a gyroscope may be employed within client computer 200 to measure or maintain an orientation of client computer 200.

Power supply 230 may provide power to client computer 200. A rechargeable or non-rechargeable battery may be used to provide power. The power may also be provided by an external power source, such as an AC adapter or a powered docking cradle that supplements or recharges the battery.

Network interface 232 includes circuitry for coupling client computer 200 to one or more networks, and is constructed for use with one or more communication protocols and technologies including, but not limited to, protocols and technologies that implement any portion of the OSI model, global system for mobile communication (GSM), CDMA, time division multiple access (TDMA), UDP, TCP/IP, SMS, MMS, GPRS, WAP, UWB, WiMax, SIP/RTP, EDGE, WCDMA, LTE, UMTS, OFDM, CDMA2000, EV-DO, HSDPA, or any of a variety of other wireless communication protocols. Network interface 232 is sometimes known as a transceiver, transceiving device, or network interface card (NIC).

Audio interface 256 may be arranged to produce and receive audio signals such as the sound of a human voice. For example, audio interface 256 may be coupled to a speaker and microphone (not shown) to enable telecommunication with others or generate an audio acknowledgment for some action. A microphone in audio interface 256 can also be used for input to or control of client computer 200, e.g., using voice recognition, detecting touch based on sound, and the like.

Display 250 may be a liquid crystal display (LCD), gas plasma, electronic ink, light emitting diode (LED), Organic LED (OLED) or any other type of light reflective or light transmissive display that can be used with a computer. Display 250 may also include a touch interface 244 arranged to receive input from an object such as a stylus or a digit from a human hand, and may use resistive, capacitive, surface acoustic wave (SAW), infrared, radar, or other technologies to sense touch or gestures.

Projector 246 may be a remote handheld projector or an integrated projector that is capable of projecting an image on a remote wall or any other reflective object such as a remote screen.

Video interface 242 may be arranged to capture video images, such as a still photo, a video segment, an infrared video, or the like. For example, video interface 242 may be coupled to a digital video camera, a web-camera, or the like. Video interface 242 may comprise a lens, an image sensor, and other electronics. Image sensors may include a complementary metal-oxide-semiconductor (CMOS) integrated circuit, charge-coupled device (CCD), or any other integrated circuit for sensing light.

Keypad 252 may comprise any input device arranged to receive input from a user. For example, keypad 252 may include a push button numeric dial, or a keyboard. Keypad 252 may also include command buttons that are associated with selecting and sending images.

Illuminator 254 may provide a status indication or provide light. Illuminator 254 may remain active for specific periods of time or in response to event messages. For example, when illuminator 254 is active, it may back-light the buttons on keypad 252 and stay on while the client computer is powered. Also, illuminator 254 may back-light these buttons in various patterns when particular actions are performed, such as dialing another client computer. Illuminator 254 may also cause light sources positioned within a transparent or translucent case of the client computer to illuminate in response to actions.

Further, client computer 200 may also comprise hardware security module (HSM) 268 for providing additional tamper resistant safeguards for generating, storing or using security/cryptographic information such as, keys, digital certificates, passwords, passphrases, two-factor authentication information, or the like. In some embodiments, the hardware security module may be employed to support one or more standard public key infrastructures (PKI), and may be employed to generate, manage, or store key pairs, or the like. In some embodiments, HSM 268 may be a stand-alone computer; in other cases, HSM 268 may be arranged as a hardware card that may be added to a client computer.

Client computer 200 may also comprise input/output interface 238 for communicating with external peripheral devices or other computers such as other client computers and network computers. The peripheral devices may include an audio headset, virtual reality headsets, display screen glasses, remote speaker system, remote speaker and microphone system, and the like. Input/output interface 238 can utilize one or more technologies, such as Universal Serial Bus (USB), Infrared, WiFi, WiMax, Bluetooth™, and the like.

Input/output interface 238 may also include one or more sensors for determining geolocation information (e.g., GPS), monitoring electrical power conditions (e.g., voltage sensors, current sensors, frequency sensors, and so on), monitoring weather (e.g., thermostats, barometers, anemometers, humidity detectors, precipitation scales, or the like), or the like. Sensors may be one or more hardware sensors that collect or measure data that is external to client computer 200.

Haptic interface 264 may be arranged to provide tactile feedback to a user of the client computer. For example, the haptic interface 264 may be employed to vibrate client computer 200 in a particular way when another user of a computer is calling. Temperature interface 262 may be used to provide a temperature measurement input or a temperature changing output to a user of client computer 200. Open air gesture interface 260 may sense physical gestures of a user of client computer 200, for example, by using single or stereo video cameras, radar, a gyroscopic sensor inside a computer held or worn by the user, or the like. Camera 240 may be used to track physical eye movements of a user of client computer 200.

GPS transceiver 258 can determine the physical coordinates of client computer 200 on the surface of the Earth, and typically outputs a location as latitude and longitude values. GPS transceiver 258 can also employ other geo-positioning mechanisms, including, but not limited to, triangulation, assisted GPS (AGPS), Enhanced Observed Time Difference (E-OTD), Cell Identifier (CI), Service Area Identifier (SAI), Enhanced Timing Advance (ETA), Base Station Subsystem (BSS), or the like, to further determine the physical location of client computer 200 on the surface of the Earth. It is understood that under different conditions, GPS transceiver 258 can determine a physical location for client computer 200. In one or more embodiments, however, client computer 200 may, through other components, provide other information that may be employed to determine a physical location of the client computer, including for example, a Media Access Control (MAC) address, IP address, and the like.

In at least one of the various embodiments, applications, such as, operating system 206, other client apps 224, web browser 226, or the like, may be arranged to employ geo-location information to select one or more localization features, such as, time zones, languages, currencies, calendar formatting, or the like. Localization features may be used in user-interfaces, reports, as well as internal processes or databases. In at least one of the various embodiments, geo-location information used for selecting localization information may be provided by GPS 258. Also, in some embodiments, geolocation information may include information provided using one or more geolocation protocols over the networks, such as, wireless network 108 or network 110.

Human interface components can be peripheral devices that are physically separate from client computer 200, allowing for remote input or output to client computer 200. For example, information routed as described here through human interface components such as display 250 or keypad 252 can instead be routed through network interface 232 to appropriate human interface components located remotely. Examples of human interface peripheral components that may be remote include, but are not limited to, audio devices, pointing devices, keypads, displays, cameras, projectors, and the like. These peripheral components may communicate over networks implemented using WiFi, Bluetooth™, Bluetooth LTE™, and the like. One non-limiting example of a client computer with such peripheral human interface components is a wearable computer, which might include a remote pico projector along with one or more cameras that remotely communicate with a separately located client computer to sense a user's gestures toward portions of an image projected by the pico projector onto a reflective surface such as a wall or the user's hand.

A client computer may include web browser application 226 that is configured to receive and to send web pages, web-based messages, graphics, text, multimedia, and the like. The client computer's browser application may employ virtually any programming language, including wireless application protocol (WAP) messages, and the like. In one or more embodiments, the browser application is enabled to employ Handheld Device Markup Language (HDML), Wireless Markup Language (WML), WMLScript, JavaScript, Standard Generalized Markup Language (SGML), HyperText Markup Language (HTML), eXtensible Markup Language (XML), HTML5, and the like.

Memory 204 may include RAM, ROM, or other types of memory. Memory 204 illustrates an example of computer-readable storage media (devices) for storage of information such as computer-readable instructions, data structures, program modules or other data. Memory 204 may store BIOS 208 for controlling low-level operation of client computer 200. The memory may also store operating system 206 for controlling the operation of client computer 200. It will be appreciated that this component may include a general-purpose operating system such as a version of UNIX, or Linux®, or a specialized client computer operating system such as iOS, or the like. The operating system may include, or interface with a Java virtual machine module that enables control of hardware components or operating system operations via Java application programs.

Memory 204 may further include one or more instances of data storage 210, which can be utilized by client computer 200 to store, among other things, applications 220 or other data. For example, data storage 210 may also be employed to store information that describes various capabilities of client computer 200. The information may then be provided to another device or computer based on any of a variety of methods, including being sent as part of a header during a communication, sent upon request, or the like. Data storage 210 may also be employed to store social networking information including address books, buddy lists, aliases, user profile information, or the like. Data storage 210 may further include program code, data, algorithms, and the like, for use by a processor, such as processor 202 to execute and perform actions. In one embodiment, at least some of data storage 210 might also be stored on another component of client computer 200, including, but not limited to, non-transitory processor-readable removable storage device 236, processor-readable stationary storage device 234, or even external to the client computer.

Applications 220 may include computer executable instructions which, when executed by client computer 200, transmit, receive, or otherwise process instructions and data. Applications 220 may include, for example, other client applications 224, web browser 226, or the like. Client computers may be arranged to exchange communications with one or more servers or other computers.

Other examples of application programs include calendars, search programs, email client applications, IM applications, SMS applications, Voice Over Internet Protocol (VOIP) applications, contact managers, task managers, transcoders, database programs, word processing programs, security applications, spreadsheet programs, games, visualization applications, and so forth.

Additionally, in one or more embodiments (not shown in the figures), client computer 200 may include an embedded logic hardware device instead of a CPU, such as, an Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA), Programmable Array Logic (PAL), or the like, or combination thereof. The embedded logic hardware device may directly execute its embedded logic to perform actions. Also, in one or more embodiments (not shown in the figures), client computer 200 may include one or more hardware micro-controllers instead of CPUs. In one or more embodiments, the one or more micro-controllers may directly execute their own embedded logic to perform actions and access their own internal memory and their own external Input and Output Interfaces (e.g., hardware pins or wireless transceivers) to perform actions, such as System On a Chip (SOC), or the like.

Illustrative Network Computer

FIG. 3 shows one embodiment of network computer 300 that may be included in a system implementing one or more of the various embodiments. Network computer 300 may include many more or fewer components than those shown in FIG. 3. However, the components shown are sufficient to disclose an illustrative embodiment for practicing these innovations. Network computer 300 may represent, for example, one or more embodiments of portfolio platform server computer 116, or the like, of FIG. 1.

Network computers, such as, network computer 300 may include a processor 302 that may be in communication with a memory 304 via a bus 328. In some embodiments, processor 302 may be comprised of one or more hardware processors, or one or more processor cores. In some cases, one or more of the one or more processors may be specialized processors designed to perform one or more specialized actions, such as, those described herein. Network computer 300 also includes a power supply 330, network interface 332, audio interface 356, display 350, keyboard 352, input/output interface 338, processor-readable stationary storage device 334, and processor-readable removable storage device 336. Power supply 330 provides power to network computer 300.

Network interface 332 includes circuitry for coupling network computer 300 to one or more networks, and is constructed for use with one or more communication protocols and technologies including, but not limited to, protocols and technologies that implement any portion of the Open Systems Interconnection model (OSI model), global system for mobile communication (GSM), code division multiple access (CDMA), time division multiple access (TDMA), user datagram protocol (UDP), transmission control protocol/Internet protocol (TCP/IP), Short Message Service (SMS), Multimedia Messaging Service (MMS), general packet radio service (GPRS), WAP, ultra-wide band (UWB), IEEE 802.16 Worldwide Interoperability for Microwave Access (WiMax), Session Initiation Protocol/Real-time Transport Protocol (SIP/RTP), or any of a variety of other wired and wireless communication protocols. Network interface 332 is sometimes known as a transceiver, transceiving device, or network interface card (NIC). Network computer 300 may optionally communicate with a base station (not shown), or directly with another computer.

Audio interface 356 is arranged to produce and receive audio signals such as the sound of a human voice. For example, audio interface 356 may be coupled to a speaker and microphone (not shown) to enable telecommunication with others or generate an audio acknowledgment for some action. A microphone in audio interface 356 can also be used for input to or control of network computer 300, for example, using voice recognition.

Display 350 may be a liquid crystal display (LCD), gas plasma, electronic ink, light emitting diode (LED), Organic LED (OLED) or any other type of light reflective or light transmissive display that can be used with a computer. In some embodiments, display 350 may be a handheld projector or pico projector capable of projecting an image on a wall or other object.

Network computer 300 may also comprise input/output interface 338 for communicating with external devices or computers not shown in FIG. 3. Input/output interface 338 can utilize one or more wired or wireless communication technologies, such as USB™, Firewire™, WiFi, WiMax, Thunderbolt™, Infrared, Bluetooth™, Zigbee™, serial port, parallel port, and the like.

Also, input/output interface 338 may include one or more sensors for determining geolocation information (e.g., GPS), monitoring electrical power conditions (e.g., voltage sensors, current sensors, frequency sensors, and so on), monitoring weather (e.g., thermostats, barometers, anemometers, humidity detectors, precipitation scales, or the like), or the like. Sensors may be one or more hardware sensors that collect or measure data that is external to network computer 300. Human interface components can be physically separate from network computer 300, allowing for remote input or output to network computer 300. For example, information routed as described here through human interface components such as display 350 or keyboard 352 can instead be routed through the network interface 332 to appropriate human interface components located elsewhere on the network. Human interface components include any component that allows the computer to take input from, or send output to, a human user of a computer. Accordingly, pointing devices such as mice, styluses, track balls, or the like, may communicate through pointing device interface 358 to receive user input.

GPS transceiver 340 can determine the physical coordinates of network computer 300 on the surface of the Earth, and typically outputs a location as latitude and longitude values. GPS transceiver 340 can also employ other geo-positioning mechanisms, including, but not limited to, triangulation, assisted GPS (AGPS), Enhanced Observed Time Difference (E-OTD), Cell Identifier (CI), Service Area Identifier (SAI), Enhanced Timing Advance (ETA), Base Station Subsystem (BSS), or the like, to further determine the physical location of network computer 300 on the surface of the Earth. It is understood that under different conditions, GPS transceiver 340 can determine a physical location for network computer 300. In one or more embodiments, however, network computer 300 may, through other components, provide other information that may be employed to determine a physical location of the network computer, including for example, a Media Access Control (MAC) address, IP address, and the like.

In at least one of the various embodiments, applications, such as, operating system 306, ingestion engine 322, portfolio engine 324, indexing engine 326, scoring engine 327, other services 329, or the like, may be arranged to employ geo-location information to select one or more localization features, such as, time zones, languages, currencies, currency formatting, calendar formatting, or the like. Localization features may be used in user interfaces, dashboards, reports, as well as internal processes or databases. In at least one of the various embodiments, geo-location information used for selecting localization information may be provided by GPS 340. Also, in some embodiments, geolocation information may include information provided using one or more geolocation protocols over the networks, such as, wireless network 108 or network 110.
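As a minimal sketch of how geo-location information might select localization features, the following Python example maps a country code to a set of features. The `Localization` record, the lookup table, and the `select_localization` function are hypothetical names introduced only for illustration, and the single time zone per country is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Localization:
    """Hypothetical bundle of localization features."""
    time_zone: str
    language: str
    currency: str
    date_format: str  # strftime pattern used for calendar formatting


# Hypothetical lookup table keyed by ISO 3166-1 alpha-2 country code.
LOCALIZATIONS = {
    "US": Localization("America/New_York", "en", "USD", "%m/%d/%Y"),
    "DE": Localization("Europe/Berlin", "de", "EUR", "%d.%m.%Y"),
}

# Fallback used when the geo-location is unknown or unsupported.
DEFAULT = Localization("UTC", "en", "USD", "%Y-%m-%d")


def select_localization(country_code: str) -> Localization:
    """Pick localization features for a country code, else the default."""
    return LOCALIZATIONS.get(country_code.upper(), DEFAULT)


loc = select_localization("de")
formatted = date(2024, 3, 5).strftime(loc.date_format)
print(loc.currency, formatted)
```

In practice the country code would be derived from GPS coordinates or from a network-based geolocation protocol; that step is omitted here.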

Memory 304 may include Random Access Memory (RAM), Read-Only Memory (ROM), or other types of memory. Memory 304 illustrates an example of computer-readable storage media (devices) for storage of information such as computer-readable instructions, data structures, program modules or other data. Memory 304 stores a basic input/output system (BIOS) 308 for controlling low-level operation of network computer 300. The memory also stores an operating system 306 for controlling the operation of network computer 300. It will be appreciated that this component may include a general-purpose operating system such as a version of UNIX, or Linux®, or a specialized operating system such as Microsoft Corporation's Windows operating system, or the Apple Corporation's macOS® operating system. The operating system may include, or interface with one or more virtual machine modules, such as, a Java virtual machine module that enables control of hardware components or operating system operations via Java application programs. Likewise, other runtime environments may be included.

Memory 304 may further include one or more instances of data storage 310, which can be utilized by network computer 300 to store, among other things, applications 320 or other data. For example, data storage 310 may also be employed to store information that describes various capabilities of network computer 300. The information may then be provided to another device or computer based on any of a variety of methods, including being sent as part of a header during a communication, sent upon request, or the like. Data storage 310 may also be employed to store social networking information including address books, friend lists, aliases, user profile information, or the like. Data storage 310 may further include program code, data, algorithms, and the like, for use by a processor, such as processor 302 to execute and perform actions such as those actions described below. In one embodiment, at least some of data storage 310 might also be stored on another component of network computer 300, including, but not limited to, non-transitory media inside processor-readable removable storage device 336, processor-readable stationary storage device 334, or any other computer-readable storage device within network computer 300, or even external to network computer 300. Data storage 310 may include, for example, evidence data stores 312, knowledge graphs 314, ingestion models 316, portfolio models 318, scoring models 319, or the like.

Applications 320 may include computer executable instructions which, when executed by network computer 300, transmit, receive, or otherwise process messages (e.g., SMS, Multimedia Messaging Service (MMS), Instant Message (IM), email, or other messages), audio, video, and enable telecommunication with another user of another mobile computer. Other examples of application programs include calendars, search programs, email client applications, IM applications, SMS applications, Voice Over Internet Protocol (VOIP) applications, contact managers, task managers, transcoders, database programs, word processing programs, security applications, spreadsheet programs, games, and so forth. Applications 320 may include ingestion engine 322, portfolio engine 324, indexing engine 326, scoring engine 327, other services 329, or the like, that may be arranged to perform actions for embodiments described below. In one or more of the various embodiments, one or more of the applications may be implemented as modules or components of another application. Further, in one or more of the various embodiments, applications may be implemented as operating system extensions, modules, plugins, or the like.

Furthermore, in one or more of the various embodiments, ingestion engine 322, portfolio engine 324, indexing engine 326, scoring engine 327, other services 329, or the like, may be operative in a cloud-based computing environment. In one or more of the various embodiments, these applications, and others, that comprise the portfolio platform may be executing within virtual machines or virtual servers that may be managed in a cloud-based computing environment. In one or more of the various embodiments, in this context the applications may flow from one physical network computer within the cloud-based environment to another depending on performance and scaling considerations automatically managed by the cloud computing environment. Likewise, in one or more of the various embodiments, virtual machines or virtual servers dedicated to ingestion engine 322, portfolio engine 324, indexing engine 326, scoring engine 327, other services 329, or the like, may be provisioned and de-commissioned automatically.

Also, in one or more of the various embodiments, ingestion engine 322, portfolio engine 324, indexing engine 326, scoring engine 327, other services 329, or the like, may be located in virtual servers running in a cloud-based computing environment rather than being tied to one or more specific physical network computers.

Further, network computer 300 may also comprise hardware security module (HSM) 360 for providing additional tamper resistant safeguards for generating, storing or using security/cryptographic information such as, keys, digital certificates, passwords, passphrases, two-factor authentication information, or the like. In some embodiments, the hardware security module may be employed to support one or more standard public key infrastructures (PKI), and may be employed to generate, manage, or store key pairs, or the like. In some embodiments, HSM 360 may be a stand-alone network computer; in other cases, HSM 360 may be arranged as a hardware card that may be installed in a network computer.

Additionally, in one or more embodiments (not shown in the figures), network computer 300 may include an embedded logic hardware device instead of a CPU, such as, an Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA), Programmable Array Logic (PAL), or the like, or combination thereof. The embedded logic hardware device may directly execute its embedded logic to perform actions. Also, in one or more embodiments (not shown in the figures), the network computer may include one or more hardware microcontrollers instead of a CPU. In one or more embodiments, the one or more microcontrollers may directly execute their own embedded logic to perform actions and access their own internal memory and their own external Input and Output Interfaces (e.g., hardware pins or wireless transceivers) to perform actions, such as System On a Chip (SOC), or the like.

Illustrative Logical System Architecture for Data Ingestion

FIG. 4 illustrates a logical architecture of a portion of portfolio platform 400 for associating thematic concepts and organizations; and modeling conformance to thematic concepts in accordance with one or more of the various embodiments. In one or more of the various embodiments, portfolio platforms, such as, portfolio platform 400 may include one or more sub-systems, including: one or more data sources, such as, data source 402; one or more ingestion engines, such as, ingestion engine 404; one or more ingestion models stored in one or more ingestion model data stores, such as, ingestion model data store 406; one or more knowledge graphs, such as, knowledge graphs 408; one or more portfolio engines, such as, portfolio engine 412; one or more portfolio models stored in one or more portfolio model data stores, such as, portfolio model data store 414; one or more user interfaces or APIs that enable users to provide concept query information, such as, concept queries 416; one or more query indexes, such as, query indexes 418; one or more portfolio data stores, such as, portfolio data store 422; one or more indexing engines, such as, indexing engine 424; one or more scoring models, such as, scoring models 426; or the like.

In one or more of the various embodiments, portfolio platforms may be arranged to receive or obtain raw data from data source 402. In some embodiments, raw data may be information provided from one or more private or public sources. In some embodiments, data sources may include news articles, press releases, social media information, government filings/records, court filings, litigation summaries, curated data sets, conference reports, or the like. In some embodiments, raw information may generally be text based. However, in some embodiments, one or more actions for extracting, transforming, or loading (ETL processing) may be performed by other services to clean up or format the raw information into text that may be suitable for additional automated analysis. For example, in some embodiments, audio files may be transcribed (automatically or otherwise) to text before being provided to a portfolio platform. Also, in some embodiments, portfolio platforms may be arranged to include one or more additional (not shown) sub-systems that perform ETL actions.

In one or more of the various embodiments, data sources may include real-time streams of information, such as, news feeds, or the like, as well as periodic bulk transfers of information, such as, annual reports, monthly periodicals, or the like.

In one or more of the various embodiments, ingestion engines, such as, ingestion engine 404 may be arranged to process information from data sources, such as, data source 402. In some embodiments, ingestion engines may be arranged to employ one or more ingestion models, such as, ingestion model 406, to perform various analysis, categorization, or classification on the incoming information.

In one or more of the various embodiments, ingestion models may be one or more data structures that encapsulate the data, rules, machine learning models, machine learning classifiers, natural language processing instructions, or instructions that may be employed to match or map information provided by data sources to one or more knowledge graphs. Ingestion models may include various components, such as, one or more machine learning based classifiers, heuristics, rules, pattern matching, conditions, or the like, that may be employed to match or map information to one or more knowledge graphs. Different ingestion models may be provided for different categories of information. For example, one ingestion model may be directed to ingesting information included in press releases while another ingestion model may be directed to ingesting information included in formal public disclosures, such as, earnings calls, merger notices, or the like.
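The idea of category-specific ingestion models can be sketched in Python as follows. This is only an illustration under stated assumptions: the `IngestionModel` class, the `select_model` helper, the category names, and the regular-expression rules are all hypothetical, and simple pattern matching stands in for the classifiers or heuristics an actual ingestion model might include.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IngestionModel:
    """Hypothetical ingestion model: pattern rules for one information category."""
    category: str
    # Concept name -> regular expression that signals the concept in text.
    rules: Dict[str, str] = field(default_factory=dict)

    def match(self, text: str) -> List[str]:
        """Return the concepts whose pattern occurs in the text, sorted by name."""
        return sorted(
            concept
            for concept, pattern in self.rules.items()
            if re.search(pattern, text, flags=re.IGNORECASE)
        )


def select_model(models: List[IngestionModel], category: str) -> IngestionModel:
    """Choose the ingestion model provided for a category of information."""
    for model in models:
        if model.category == category:
            return model
    raise KeyError(category)


MODELS = [
    IngestionModel("press_release", {
        "renewable energy": r"\b(solar|wind) (farm|power)\b",
        "electric vehicles": r"\belectric vehicles?\b",
    }),
    IngestionModel("public_disclosure", {
        "mergers": r"\b(merger|acquisition)\b",
    }),
]

text = "Acme Corp. announced a new solar farm and a fleet of electric vehicles."
matched = select_model(MODELS, "press_release").match(text)
print(matched)
```

The matched concept names would then be used to create or update nodes in a knowledge graph; that step is left out of the sketch.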

In one or more of the various embodiments, ingestion engines may be arranged to generate or update one or more knowledge graphs, such as, knowledge graphs 408, based on the applications of the one or more ingestion models. In one or more of the various embodiments, knowledge graphs 408 may represent one or more different knowledge graphs that may be arranged to capture different types of information or relationships, such as, concept graphs, entity graphs, data graphs, or the like.

In one or more of the various embodiments, concept graphs may be one or more data structures or data models that include objects that may represent concepts and their respective relationships. In some embodiments, concept graphs may be based on or represent one or more ontologies. In some embodiments, ontologies or taxonomies represented in concept graphs may be pre-defined, custom, or portions of existing ontologies or taxonomies, or combinations thereof. In some embodiments, ontologies or taxonomies may be created by subject matter experts. In some embodiments, concept graphs may represent the logical organization or relationships of concepts. In some embodiments, each node of a concept graph may be associated with one or more other concepts.

In one or more of the various embodiments, entity graphs may be one or more data structures or data models that include objects that may represent entities or organizations and their respective relationships with other entities or organizations. In some embodiments, each node of an entity graph may be associated with one or more different entities or organizations. Also, in some embodiments, nodes in entity graphs may be associated with various attributes of each represented entity or organization. In some embodiments, each node in an entity graph may be arranged to represent individual instances of entities or organizations rather than classes or types of entities or organizations.

In one or more of the various embodiments, data graphs may be one or more data structures or data models that include objects that may represent a synthesis of information from concept graphs and entity graphs.

In one or more of the various embodiments, ingestion engines may be arranged to store information provided by data sources in one or more evidence data stores, such as, evidence data store 410, or the like. Accordingly, in some embodiments, ingestion engines may be arranged to include a reference or identifier with relevant nodes in concept graphs, entity graphs, or data graphs that enables the original information that was used to generate the node to be viewed by users or other services.

In one or more of the various embodiments, portfolio engines, such as, portfolio engine 412 may be arranged to employ one or more portfolio models, such as, portfolio models 414, to identify one or more entities based on their association with one or more concepts or themes. Accordingly, in some embodiments, portfolio engines may be arranged to receive concept query information and generate result sets that include one or more entities that may be associated with the concepts included in the concept query information.

In one or more of the various embodiments, portfolio models may be one or more data structures that encapsulate the data, rules, machine learning models, machine learning classifiers, or instructions that may be employed to determine one or more organizations or entities that may be correlated or otherwise associated with one or more concepts or themes. In one or more of the various embodiments, portfolio models may be arranged to include the criteria (e.g., rules, classifiers, or the like) for determining if entities may be correlated or associated with provided themes or concepts. In some embodiments, portfolio engines may be arranged to evaluate one or more knowledge graphs (e.g., concept graphs, entity graphs, or data graphs) using the one or more portfolio models to determine the entities that may be correlated with the one or more provided concepts.

Also, in one or more of the various embodiments, portfolio engines may be arranged to employ one or more query indexes, such as, query indexes 418 to rapidly execute searches against knowledge graphs or evidence to identify candidate entities based on the provided concept query information.

In one or more of the various embodiments, indexing engines, such as, indexing engine 424 may be arranged to periodically process or analyze evidence in evidence data store 410 to update the query indexes. In some embodiments, query indexes 418 may be arranged to include two or more separate or partial indexes. Accordingly, in some embodiments, different indexes may be directed to using different types of keys, bucket sizes, scope, or the like. Also, in some embodiments, two or more indexes may be arranged hierarchically with respect to each other. For example, for some embodiments, a first index may include keys corresponding to broad concepts, while a related second index includes narrower concepts that may be related to the broader concepts indexed in the first index.
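
The hierarchical arrangement described above may be illustrated with a minimal sketch. It assumes two in-memory indexes keyed by concept name; the names `broad_index`, `narrow_index`, `add_evidence`, and `lookup`, and the evidence identifiers, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical two-level index: broad concepts -> narrower concepts -> evidence ids.
broad_index = defaultdict(set)     # broad concept -> narrower concepts
narrow_index = defaultdict(set)    # narrower concept -> evidence identifiers


def add_evidence(broad, narrow, evidence_id):
    """Record that a piece of evidence mentions `narrow`, filed under `broad`."""
    broad_index[broad].add(narrow)
    narrow_index[narrow].add(evidence_id)


def lookup(broad):
    """Return every evidence id reachable from a broad concept."""
    found = set()
    for narrow in broad_index.get(broad, ()):
        found |= narrow_index[narrow]
    return found


add_evidence("green energy", "solar power", "doc-1")
add_evidence("green energy", "wind power", "doc-2")
add_evidence("artificial intelligence", "neural networks", "doc-3")
```

In this sketch, a query for a broad concept first resolves the narrower concepts in the first index and then collects evidence from the second index, so each index may be sized or partitioned independently.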

Accordingly, in one or more of the various embodiments, indexing engines, such as, indexing engine 424 may be arranged to generate indexes based on the knowledge graphs and the evidence data stores.

In some embodiments, in response to concept query information, portfolio engines, such as, portfolio engine 412 may be arranged to determine one or more candidate entities, such as, candidate entities 420. In one or more of the various embodiments, candidate entities may be a list of entities that may be ranked, grouped, or sorted based on their correlation or association with concepts derived from the concept query information. In one or more of the various embodiments, the listed entities may be referred to as candidate entities until they are associated with one or more portfolios, such as, portfolios 422.

In one or more of the various embodiments, portfolios may be data structures that list or reference a set of entities selected from the candidate entities. In some embodiments, portfolios may be associated with users; for example, individual users may have one or more portfolios.

In one or more of the various embodiments, as described in more detail below, portfolio engines may be arranged to perform various actions to determine candidate entities that may qualify for inclusion in one or more portfolios. Accordingly, in some embodiments, portfolio platforms may enable users to identify or classify organizations based on their membership in a given portfolio. For example, in some embodiments, organizations that qualify for inclusion in a “green energy” portfolio may be assumed to have some level of correlation with concepts related to green energy. However, in some embodiments, absent a deep-dive into the underlying ingestion models, knowledge graphs, portfolio models, and so on, it may be difficult for users to quantify how or why a given organization has qualified for inclusion in a particular portfolio. Thus, in some embodiments, portfolio platforms may be arranged to generate scoring information associated with how well an organization conforms to one or more concepts or themes.

In some embodiments, portfolio platforms may be arranged to employ one or more scoring models, such as, scoring models 426 to generate scoring information for organizations that enables the level of conformance to the goals of a concept or theme to be compared across different organizations. Also, in some embodiments, thematic scores may represent the level of prominence that concepts or themes (e.g., multiple related concepts) may have within an organization. For example, in some embodiments, if an organization has a high thematic score for a particular concept, the user may be informed that the organization considers that concept important or at least meets the criteria to be considered supportive of or in conformance with the concept.

In one or more of the various embodiments, scoring models may include the rules, instructions, formulas, weights, or the like, to compute thematic scores based on the knowledge graphs. In some embodiments, different concepts or different themes may require one or more different rules, instructions, formulas, weights, or the like, to generate meaningful thematic scores. Accordingly, in some embodiments, scoring models may be provided by configuration information to enable new or different scoring models to be added to portfolio platforms. Likewise, in some embodiments, existing scoring models may be modified manually or automatically based on active or passive user feedback.
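
As a non-limiting illustration of a scoring model supplied by configuration, the following sketch computes a thematic score as a weighted average of relationship strengths. The class `ScoringModel`, the weighted-average formula, the weights, and the strength values are all assumptions made for this example; the embodiments do not prescribe a particular formula.

```python
from dataclasses import dataclass, field


@dataclass
class ScoringModel:
    """Hypothetical scoring model: per-concept weights loaded from configuration."""
    weights: dict = field(default_factory=dict)

    def thematic_score(self, strengths):
        """Weighted average of an entity's concept relationship strengths (0..1)."""
        total = sum(self.weights.values())
        if total == 0:
            return 0.0
        weighted = sum(w * strengths.get(c, 0.0) for c, w in self.weights.items())
        return weighted / total


# Illustrative configuration for a "green energy" theme (values are made up).
model = ScoringModel(weights={"solar power": 2.0, "wind power": 1.0, "batteries": 1.0})
company_a = {"solar power": 0.9, "wind power": 0.4}
company_d = {"solar power": 0.2, "batteries": 0.6}
```

Because the weights live in a plain mapping, a different theme may be scored by loading a different mapping, and feedback-driven adjustment may amount to editing the weight values.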

FIG. 5 illustrates a logical schematic of a portion of concept graph 500 for associating thematic concepts and organizations in accordance with one or more of the various embodiments.

In one or more of the various embodiments, concept graphs may include one or more nodes that each correspond to a concept. In some embodiments, edges in concept graphs may represent one or more relationships between the concepts included in the concept graphs.

In one or more of the various embodiments, ingestion engines may be arranged to employ one or more ingestion models to identify the occurrence of concepts from evidence (information) provided by one or more data sources. Likewise, in some embodiments, ingestion engines may be arranged to employ the one or more ingestion models to determine relationships between concepts. In some embodiments, the scope or definition of the relevant relationships may vary depending on problem domain, industry, or the like. Accordingly, in some embodiments, ingestion engines may be arranged to rely on ingestion models to determine the specific criteria for determining if two or more concepts may be related under the terms defined in the ingestion models. For example, for some embodiments, one or more ingestion models may be arranged to define relationships based on one or more taxonomies, ontologies, or the like, that may be determined to be relevant for the type of concepts being processed. In some embodiments, ingestion engines or ingestion models may be arranged to employ natural language processing methods, such as, topic finding, or the like, to identify topics that may be inferred from the information provided by data sources. Also, in some embodiments, ingestion engines may be arranged to enable one or more users (e.g., domain experts) to define taxonomies, ontologies, or the like, for specific problem domains. Similarly, in some embodiments, ingestion engines may be arranged to enable one or more users to modify or augment one or more taxonomies or ontologies.

Accordingly, in some embodiments, as information from data sources is processed by ingestion engines, one or more concepts in the information may be extracted based on pre-defined taxonomies or ontologies, expert-created custom taxonomies or ontologies, or combinations thereof.

In this example, for some embodiments, node 502 may represent a node that corresponds to the concept of “artificial intelligence” and node 506 may represent a node that corresponds to the concept of “neural networks”. Accordingly, in this example, relationship 504 may represent the one or more relationships between “artificial intelligence” node 502 and “neural networks” node 506.

In general, for some embodiments, relationships between nodes of a knowledge graph may represent various relationships between the nodes. The particular relationships may vary depending on the type of knowledge graph or its purpose.

In some embodiments, as described above, ingestion engines may be arranged to determine relationships between concepts based on taxonomies or ontologies. In one or more of the various embodiments, the relative strength of relationships represented in concept graphs may be based on the frequency that each relationship may be observed in information provided from data sources.

In one or more of the various embodiments, ingestion engines may be arranged to generate partial concept graphs for some or all of the evidence that may be ingested. For example, a news report describing artificial intelligence applications to medicine may result in a partial concept graph similar to concept graph 500. Accordingly, in one or more of the various embodiments, ingestion engines may be arranged to increment a counter value for each relationship in concept graph 500 that may be included in the new partial concept graph made from incoming information. Thus, in some embodiments, relationships that are observed more often than other relationships may have higher count values. In some embodiments, ingestion engines may be arranged to consider concept graph relationships that have higher counts as being stronger than other relationships with lower counts.
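
The counting behavior described above may be sketched as follows. This is a minimal illustration that assumes relationships are undirected and keyed by an unordered pair of concept names; `relationship_counts`, `observe`, and `stronger` are hypothetical names.

```python
from collections import Counter

# Edge counters for a concept graph; keys are unordered concept pairs.
relationship_counts = Counter()


def observe(partial_graph_edges):
    """Increment the counter of each relationship seen in a partial concept graph."""
    for a, b in partial_graph_edges:
        relationship_counts[frozenset((a, b))] += 1


# Two partial concept graphs, e.g., derived from two separate news reports.
observe([("artificial intelligence", "neural networks"),
         ("artificial intelligence", "medicine")])
observe([("neural networks", "artificial intelligence")])


def stronger(edge_x, edge_y):
    """True if edge_x has been observed more often than edge_y."""
    return relationship_counts[frozenset(edge_x)] > relationship_counts[frozenset(edge_y)]
```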

In one or more of the various embodiments, concept graphs may include one or more disconnected portions. In some embodiments, disconnected portions of a concept graph represent unrelated concepts.

Note, in this example, and throughout this discussion, knowledge graphs are illustrated as conventional graphs in two dimensions. One of ordinary skill in the art will appreciate that a variety of data structures, such as, tables, structures, lists, hashes, arrays, objects, or the like, may be employed to store or implement the nodes or relationships of knowledge graphs depending on the capabilities of the underlying data stores or other local requirements or local circumstances.

FIG. 6 illustrates a logical schematic of a portion of concept graph 600 for associating thematic concepts and organizations in accordance with one or more of the various embodiments. In this example, concept graph 600 represents a concept graph similar in structure or purpose to concept graph 500. However, in this example, it includes different concepts that are related to green energy rather than artificial intelligence. For brevity and clarity, additional description of concept graph 600 is omitted as it may be considered similar in function to concept graph 500 described above.

In one or more of the various embodiments, portfolio engines or scoring engines may be arranged to determine one or more concepts that may be related to an anchor concept. In some embodiments, an anchor concept may be a concept identified in a query or otherwise selected by a user. In some embodiments, portfolio engines or scoring engines may be arranged to determine or provide one or more other concepts that may be related to an anchor concept.

Accordingly, in one or more of the various embodiments, portfolio engines or scoring engines may be arranged to provide one or more related concepts based on traversing a concept graph to determine one or more concepts based on one or more relationships between the one or more concepts and the anchor concept such that each relationship between the anchor concept and the one or more concepts may be associated with a relationship strength value that exceeds a threshold value. In some embodiments, themes may be determined based on one or more anchor concepts and the one or more concepts determined to be related to the anchor concept. In this example, for some embodiments, concept 602 may be considered an anchor concept.
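
The traversal described above may be sketched as a breadth-first walk that follows only relationships whose strength exceeds the threshold. The graph contents, the strength values, and the function name `related_concepts` are hypothetical, and following qualifying relationships across more than one hop is an assumption of this sketch rather than a requirement of the embodiments.

```python
from collections import deque

# Hypothetical concept graph: concept -> {neighbor: relationship strength}.
concept_graph = {
    "green energy": {"solar power": 0.9, "wind power": 0.7, "coal": 0.1},
    "solar power": {"green energy": 0.9, "photovoltaics": 0.8},
    "wind power": {"green energy": 0.7},
    "photovoltaics": {"solar power": 0.8},
    "coal": {"green energy": 0.1},
}


def related_concepts(graph, anchor, threshold):
    """Breadth-first walk from the anchor, following only edges above threshold."""
    seen = {anchor}
    queue = deque([anchor])
    while queue:
        current = queue.popleft()
        for neighbor, strength in graph.get(current, {}).items():
            if strength > threshold and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(anchor)
    return seen


# A theme formed from the anchor concept plus its sufficiently related concepts.
theme = {"green energy"} | related_concepts(concept_graph, "green energy", 0.5)
```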

FIG. 7 illustrates a logical schematic of a portion of entity graph 700 for associating thematic concepts and organizations in accordance with one or more of the various embodiments.

In one or more of the various embodiments, entity graphs may include one or more nodes that each correspond to an entity, such as, a company, an organization, or the like. In some embodiments, edges in entity graphs may represent one or more relationships between the entities included in the entity graphs.

In one or more of the various embodiments, ingestion engines may be arranged to employ one or more ingestion models to determine entities from evidence information provided by one or more data sources. Accordingly, in some embodiments, one or more ingestion models may be arranged to determine entities from company names, trademarks, products, or the like, that may be included in evidence information.

In some embodiments, ingestion models may be arranged to employ natural language processing to determine relationships between the observed entities based on various metrics such as word distances, frequency of co-appearance, or the like.
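
One of the metrics mentioned above, frequency of co-appearance, may be sketched as a count of documents that mention both entities. This is a deliberately simple illustration that assumes exact, case-insensitive name matching; the function name `co_appearance` and the sample documents are hypothetical, and a production ingestion model would likely rely on more robust entity recognition.

```python
from collections import Counter
from itertools import combinations


def co_appearance(documents, entity_names):
    """Count, per entity pair, the number of documents mentioning both names."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        # Sort so that each pair is always keyed in the same order.
        present = sorted(n for n in entity_names if n.lower() in lowered)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


docs = [
    "Company A signed a supply agreement with Company D.",
    "Company D and Company A opened a joint plant.",
    "Company B reported quarterly earnings.",
]
pairs = co_appearance(docs, ["Company A", "Company B", "Company D"])
```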

Also, in one or more of the various embodiments, ingestion engines may be arranged to employ one or more ingestion models that are arranged to detect specific features that may be associated with one or more different types of relationships between related entities. Accordingly, in some embodiments, one or more ingestion models may be directed to evaluate one or more features, such as, geographic location, board members, suppliers, customers, litigation activity, markets/problem domains, financial performance, property records, employees, growth plans, employment activity (e.g., head count, announced layoffs, job ads, or the like), or the like. In some embodiments, ingestion engines may be arranged to determine ingestion models from configuration information to account for local requirements or local circumstances.

In some embodiments, ingestion models may include rules or instructions that validate or compare facts discovered from evidence information with one or more external services or databases, such as, corporate registries, trademark registries, property records, or the like.

In this example, for some embodiments, entity node 702 and entity node 704 may be considered to represent entities. In this example, the entities are companies. However, these innovations are not so limited; other entity graphs may represent different types of entities, or the same entity graph may include different types of entities.

Also, in this example, for some embodiments, relationship 706 represents one or more relationships between Company A (node 702) and Company D (node 704). In this example, for brevity and clarity, each relationship in entity graph 700 may represent a bundle of relationships representing various types of relationships between entities that may be discovered or observed. In some embodiments, different entities in the same entity graph may be related by different types of relationships. Alternatively, in some embodiments, ingestion engines may be arranged to generate different or separate entity graphs for one or more different relationship types. For example, an entity graph that represents relationships based on geography may be separate from another entity graph that represents relationships based on customers.

In one or more of the various embodiments, ingestion engines may be arranged to employ the one or more ingestion models to determine the strength of the different relationships. Also, in one or more of the various embodiments, ingestion engines may be arranged to weight different types of relationships differently based on configuration information to account for local requirements or local circumstances.

FIG. 8 illustrates a logical schematic of a portion of data graph 800 for associating thematic concepts and organizations in accordance with one or more of the various embodiments.

In one or more of the various embodiments, data graphs may include one or more nodes that each correspond to an entity, concept, or the like. In some embodiments, edges in data graphs may represent one or more relationships between the entities included in the data graphs or the concepts included in the data graphs. Accordingly, in some embodiments, ingestion engines may be arranged to generate data graphs based on a synthesis of concept graphs and entity graphs.

Thus, in this example, data graph 800 includes concepts from concept graph 600 and entities from entity graph 700. For example, data graph 800 includes solar power (node 802) and machine learning (node 806), which could be from concept graph 600, and company A (node 808) and company D (node 810), which could be from entity graph 700.

However, in one or more of the various embodiments, data graphs may include one or more relationships that do not exist in concept graphs or entity graphs. In this example, for some embodiments, relationship 804 represents a relationship between the concepts solar power and machine learning that does not exist in concept graph 600. Here, in this example, the two concepts may be considered related because of one or more relationships between company A (node 808) and company D (node 810), or the like.
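
One way such a derived relationship might arise can be sketched as follows. The sketch assumes each entity is already associated with a set of concepts and proposes a concept-to-concept relationship whenever two related entities carry concepts that are not already linked; the names `entity_concepts`, `entity_edges`, and `derived_concept_edges` are hypothetical, and the actual synthesis rules are left to the ingestion models.

```python
from itertools import product

# Hypothetical inputs: entity -> concepts it is associated with, and entity edges.
entity_concepts = {"Company A": {"solar power"}, "Company D": {"machine learning"}}
entity_edges = [("Company A", "Company D")]
concept_edges = set()  # the two concepts are unrelated in the concept graph


def derived_concept_edges(entity_concepts, entity_edges, concept_edges):
    """Concept pairs linked only through a relationship between their entities."""
    derived = set()
    for left, right in entity_edges:
        for a, b in product(entity_concepts.get(left, ()), entity_concepts.get(right, ())):
            pair = frozenset((a, b))
            # Skip self-pairs and pairs the concept graph already relates.
            if len(pair) == 2 and pair not in concept_edges:
                derived.add(pair)
    return derived


new_edges = derived_concept_edges(entity_concepts, entity_edges, concept_edges)
```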

Generalized Operations for Data Ingestion

FIGS. 9-13 represent generalized operations for associating thematic concepts and organizations in accordance with one or more of the various embodiments. In one or more of the various embodiments, processes 900, 1000, 1100, 1200, and 1300 described in conjunction with FIGS. 9-13 may be implemented by or executed by one or more processors on a single network computer, such as network computer 300 of FIG. 3. In other embodiments, these processes, or portions thereof, may be implemented by or executed on a plurality of network computers, such as network computer 300 of FIG. 3. In yet other embodiments, these processes, or portions thereof, may be implemented by or executed on one or more virtualized computers, such as, those in a cloud-based environment. However, embodiments are not so limited and various combinations of network computers, client computers, or the like may be utilized. Further, in one or more of the various embodiments, the processes described in conjunction with FIGS. 9-13 may perform actions for associating thematic concepts and organizations in accordance with at least one of the various embodiments or architectures such as those described in conjunction with FIGS. 4-8. Further, in one or more of the various embodiments, some or all of the actions performed by processes 900, 1000, 1100, 1200, and 1300 may be executed in part by ingestion engine 322, portfolio engine 324, indexing engine 326, scoring engine 327, or the like.

FIG. 9 illustrates an overview flowchart for process 900 for associating thematic concepts and organizations in accordance with one or more of the various embodiments. After a start block, at block 902, in one or more of the various embodiments, ingestion engines may be arranged to ingest raw information from one or more data sources. As described above, in some embodiments, ingestion engines may be arranged to connect with one or more data sources that may provide raw information. In some embodiments, raw information may include live/real-time streams, such as, news feeds, financial activity information, consumer sentiment information, market reports, or the like. In some embodiments, one or more data sources may periodically provide raw information, such as, private/public newsletters, private/public reports, or the like. Also, in some embodiments, one or more data sources may provide one or more curated lists, collections, or databases of information. These may include reports, summaries, predictions, forecasts, or the like, curated by one or more industry/topic experts.

At block 904, in one or more of the various embodiments, ingestion engines may be arranged to generate one or more knowledge graphs from the raw information based on one or more ingestion models. In one or more of the various embodiments, ingestion engines may be arranged to employ one or more ingestion models to generate one or more knowledge graphs from the information provided by data sources.

In some embodiments, ingestion engines may be arranged to generate one or more concept graphs, one or more entity graphs, or one or more data graphs. In some embodiments, concept graphs may represent concepts and relationships between concepts. In some embodiments, entity graphs may be arranged to represent entities and their relationships with each other based on one or more characteristics of the entities. In some embodiments, entities may be organizations, companies, government agencies, educational institutions, or the like.

At block 906, in one or more of the various embodiments, ingestion engines may be arranged to store or archive the raw information in one or more evidence data stores.

In one or more of the various embodiments, while ingestion engines may be arranged to execute a variety of actions such as transforming, indexing, cleaning, formatting, or the like, on the ingested information, the original raw information may be captured or preserved in an archival data source. In some embodiments, the archived information may be associated with one or more nodes or relationships in one or more knowledge graphs. Thus, in some embodiments, users may be enabled to drill down to source documents from portions of the various knowledge graphs.

Likewise, in some embodiments, source documents may be linked to query results or portfolios to provide support regarding why one or more entities may be included in query results or portfolios. Accordingly, in some embodiments, portfolio engines may be arranged to provide one or more interactive reports that enable users to browse or otherwise access the underlying raw information that may be responsible for query results. Similarly, in some embodiments, entities associated with portfolios may be linked to raw information exhibits that may be responsible for indicating that a given entity should be included in a portfolio.

At block 908, in one or more of the various embodiments, ingestion engines may be arranged to evaluate one or more portfolios based on the one or more knowledge graphs.

In one or more of the various embodiments, one or more portfolios may be associated with one or more concepts, or themes (e.g., collections of one or more concepts). Likewise, in some embodiments, portfolios may be associated with additional criteria associated with the concepts, such as strength-of-relationship threshold values, balancing rules (e.g., share of portfolio associated with particular concepts or themes may be defined), inclusion/exclusion rules, or the like.

Also, in some embodiments, portfolios may be associated with one or more criteria based on one or more characteristics of the entities, such as, employee count, geographic location, entity type, market size, entity valuation, product type, ownership, leadership, or the like.

Accordingly, in some embodiments, portfolio engines may be arranged to monitor entities or portfolios to identify one or more candidate entities that may be recommended for inclusion in existing portfolios. Likewise, in some embodiments, portfolio engines may be arranged to monitor entities or portfolios to determine one or more entities that may be recommended for removal from one or more portfolios.

Also, in some embodiments, portfolio engines may be arranged to monitor portfolios to determine if one or more portfolios have drifted out of compliance with one or more of the criteria defining the portfolio. For example, if a portfolio is configured to include 30% of entities with strong associations with machine learning and 70% of entities with strong associations with green energy, portfolio engines may be arranged to periodically evaluate if the entities in the portfolio conform to the desired balance.
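
The 30%/70% example above may be sketched as a periodic balance check. The sketch assumes each entity has already been assigned a single dominant theme and that a tolerance around each target share is acceptable; the function `balance_drift`, the `tolerance` parameter, and the entity names are hypothetical.

```python
def balance_drift(portfolio, targets, tolerance=0.05):
    """Return the themes whose share of the portfolio misses its target share.

    `portfolio` maps entity name -> dominant theme; `targets` maps theme -> share.
    """
    total = len(portfolio)
    drifted = {}
    for theme, target in targets.items():
        actual = sum(1 for t in portfolio.values() if t == theme) / total if total else 0.0
        if abs(actual - target) > tolerance:
            drifted[theme] = actual
    return drifted


targets = {"machine learning": 0.30, "green energy": 0.70}

# A portfolio that has drifted to an even split between the two themes.
portfolio = {f"ML-{i}": "machine learning" for i in range(5)}
portfolio.update({f"GE-{i}": "green energy" for i in range(5)})
```

An empty result indicates that the portfolio conforms to the configured balance; a non-empty result lists the themes, with their observed shares, that may warrant a report or notification.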

At block 910, in one or more of the various embodiments, ingestion engines may be arranged to generate one or more reports or notifications regarding the evaluation of the one or more portfolios.

In one or more of the various embodiments, such reports may include one or more entities that may be recommended for inclusion in or exclusion from one or more portfolios. Likewise, in some embodiments, reports may include one or more portfolios that fail to conform with one or more criteria used to establish the portfolios. For example, a portfolio that is configured to exclude members that are closely associated with Company X may be flagged for review if one or more members establish a violating relationship with Company X.

Next, in one or more of the various embodiments, control may be returned to a calling process.

FIG. 10 illustrates a flowchart for process 1000 for associating thematic concepts and organizations in accordance with one or more of the various embodiments. After a start block, at block 1002, in one or more of the various embodiments, ingestion engines may be arranged to ingest raw information from one or more data sources. See the discussion of raw information ingestion above.

At block 1004, in one or more of the various embodiments, ingestion engines may be arranged to determine one or more concepts from the information based on one or more ingestion models.

In one or more of the various embodiments, ingestion engines may be arranged to employ one or more ingestion models to determine concepts or concept relationships from raw information.

In some embodiments, ingestion models may be arranged to employ one or more heuristics, machine learning classifiers, natural language processing (NLP), or the like, to determine concepts in a given portion of raw information.

In one or more of the various embodiments, ingestion engines may be arranged to employ one or more parsers, grammars, decoders, or the like, to extract natural language text from the raw information. Likewise, in some embodiments, one or more ingestion models may be arranged to include specific heuristics, parsers, grammars, classifiers, pattern matchers, or the like, for processing raw information. Thus, in some embodiments, if a new type of raw information may be added to the portfolio platform, one or more ingestion models may be added to process the new type of raw information to identify concepts.

In one or more of the various embodiments, ingestion engines may be arranged to employ one or more taxonomies or ontologies that identify one or more concepts and the one or more text words that may be associated with a given concept. In some embodiments, portfolio platforms may be arranged to enable users to modify taxonomies, or the like, to include/exclude one or more concepts from consideration. Likewise, in some embodiments, users may be enabled to modify the text words that may be associated with a given concept.
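
A taxonomy that maps concepts to associated text words may be sketched as a simple lookup. The sketch assumes single-word, lower-case terms and whole-word matching; the `taxonomy` contents and the function name `concepts_in` are hypothetical, and real ingestion models may employ classifiers or other natural language processing instead.

```python
import re

# Hypothetical taxonomy: concept -> text words associated with it.
taxonomy = {
    "solar power": {"solar", "photovoltaic"},
    "machine learning": {"classifier", "neural"},
}


def concepts_in(text, taxonomy):
    """Return the concepts whose associated words appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {concept for concept, terms in taxonomy.items() if words & terms}


found = concepts_in("The firm installs photovoltaic panels.", taxonomy)

# A user edit: associate another text word with an existing concept.
taxonomy["machine learning"].add("regression")
```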

At block 1006, in one or more of the various embodiments, ingestion engines may be arranged to generate one or more partial concept graphs based on the raw information and the one or more ingestion models.

In one or more of the various embodiments, individual portions of raw information may be processed separately to identify concepts or concept relationships included in each portion of raw information. Accordingly, in some embodiments, while the concepts may be derived from central taxonomies, a partial concept graph may be generated individually for some or all portions of raw information. Thus, for example, if the raw information under consideration is a two-page document, the ingestion engine may produce a partial concept graph that includes the concepts or concept relationships produced from natural language included in the two-page document.

In one or more of the various embodiments, partial concept graphs may include strength of relationship values for the various concept relationships determined from each portion of raw information.

At block 1008, in one or more of the various embodiments, ingestion engines may be arranged to add the one or more partial concept graphs to a main concept graph. In some embodiments, ingestion engines may be arranged to generate one or more central concept graphs for each portfolio platform. In some embodiments, portfolio platforms may be multi-tenant systems such that two or more organizations share the same portfolio platform service. Also, in some embodiments, organizations may be enabled to operate separate or private portfolio platforms that may be hosted in private cloud environments, on-premises servers, or a combination thereof.

In some embodiments, portfolio platforms configured for multi-tenant operation may be arranged to keep knowledge graphs of different tenants isolated or quarantined. Alternatively, in some embodiments, central concept graphs may be shared across multiple tenants.

In one or more of the various embodiments, ingestion engines may be arranged to add the one or more partial concept graphs determined from raw information into one or more central concept graphs. In some embodiments, central concept graphs may be comprised of concepts or concept relationships discovered in raw information. Accordingly, in some embodiments, the one or more partial concept graphs determined from ingested raw information may be added to the one or more central concept graphs.

In one or more of the various embodiments, if the partial concept graphs include concepts that may be missing or absent from central concept graphs, those concepts may be added to central concept graphs. Similarly, in some embodiments, concept relationships that may be absent from central concept graphs may be added to central concept graphs.

Accordingly, in some embodiments, over time, as raw information may be continuously processed, ingestion engines may be arranged to continuously add partial concept graphs to central concept graphs.

At block 1010, in one or more of the various embodiments, ingestion engines may be arranged to update the strength of one or more relationships in the concept graph based on the added partial concept graphs.

In one or more of the various embodiments, ingestion engines may be arranged to keep a count of the number of times concepts or concept relationships may be determined from raw information. Accordingly, in some embodiments, nodes in concept graphs may be associated with a counter value that may be incremented as duplicate concepts may be discovered. Likewise, in some embodiments, edges in concept graphs may be associated with another counter value that may be incremented as concept relationships are observed.

Accordingly, in some embodiments, as concepts or concept relationships may be added to concept graphs, ingestion engines may be arranged to update a strength of relationship score that may be associated with each concept relationship.

Thus, in some embodiments, the number of times a particular concept relationship may be determined from raw information may be employed as a term for determining the strength of relationship score. In some embodiments, ingestion engines may be arranged to employ additional terms or expressions, such as, time decay, source weights, content type weights, or the like, that may be considered for computing strength of relationship scores.

In some embodiments, ingestion engines may be arranged to enable different organizations to define or modify one or more parameters or expressions associated with determining strength of relationship scores. For example, organization A may value information from source A more than source B. Accordingly, in this example, organization A may be enabled to configure one or more parameter values that may be employed to reduce the impact on strength of relationship scores of partial concept graphs associated with source B as compared to source A.

Accordingly, in some embodiments, ingestion engines may be arranged to determine rules, expressions, or the like, for determining strength of relationship scores from configuration information to account for local circumstances or local requirements.
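
As a non-limiting illustration, the strength of relationship computation described above may be sketched as follows. This is a minimal sketch under stated assumptions: the names Observation and strength_of_relationship, the exponential half-life decay, and the default weight of 1.0 for unlisted sources are hypothetical choices made for illustration and are not specified by the embodiments.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One sighting of a concept relationship in raw information (hypothetical)."""
    source: str       # identifier of the data source that produced the sighting
    age_days: float   # how long ago the sighting was ingested


def strength_of_relationship(observations, source_weights, half_life_days=180.0):
    """Sum one time-decayed, source-weighted term per observation.

    With no decay and unit weights this reduces to the plain observation
    count described in the text. The half-life form is an assumption.
    """
    score = 0.0
    for obs in observations:
        weight = source_weights.get(obs.source, 1.0)      # unlisted sources count fully
        decay = 0.5 ** (obs.age_days / half_life_days)    # halves every half_life_days
        score += weight * decay
    return score
```

In this sketch, an organization that values source A over source B could supply source_weights such as {"A": 1.0, "B": 0.25} through configuration information.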

Next, in one or more of the various embodiments, control may be returned to a calling process.

FIG. 11 illustrates a flowchart for process 1100 for associating thematic concepts and organizations in accordance with one or more of the various embodiments. After a start block, at block 1102, in one or more of the various embodiments, as described above, ingestion engines may be arranged to ingest raw information from one or more data sources.

At block 1104, in one or more of the various embodiments, ingestion engines may be arranged to determine entity information based on the raw information and one or more ingestion models. In one or more of the various embodiments, entity information may be considered information or evidence that may assert or infer one or more facts associated with entities.

In some embodiments, ingestion engines may be arranged to employ ingestion models that may be tailored to determining entity information. In some embodiments, ingestion models may be arranged to include regular expressions, classifiers, or the like, that can determine one or more specific types of evidence associated with various entities. Also, in some embodiments, ingestion engines may be arranged to employ ingestion models that include or access one or more databases of entity information that may be employed to identify or confirm entity information. For example, in some embodiments, one or more ingestion models may be arranged to compare words from raw information to a database of corporations, trademarks, product names, or the like, to determine names of entities that may be referenced in the ingested information.

In one or more of the various embodiments, ingestion engines may be arranged to determine a variety of different types of entity attributes from the ingested information. In some embodiments, attributes may include one or more of, geographic location, product types, market size, industry, number of employees, number of offices/locations, form of entity (e.g., corporation, partnership, limited liability company, association, or the like), type of entity (e.g., for-profit business, non-profit business, educational institution, government agency/department, or the like), board members, officers, partnerships, trademarks, patents/patent applications, or the like.

In one or more of the various embodiments, one or more ingestion models may be tailored for discovering or confirming one or more of the attributes. In some embodiments, different organizations may be more or less interested in some attributes. Accordingly, in some embodiments, different organizations may require or provide one or more ingestion models that may be arranged to identify one or more particular attributes that other organizations may be uninterested in.

In one or more of the various embodiments, ingestion engines may be arranged to enable organizations (e.g., users) to selectively determine some or all of the entity attributes they may be interested in. In some embodiments, ingestion engines may be arranged to employ one or more external or remote services for obtaining or confirming one or more entity attributes. In some embodiments, ingestion engines may be arranged to provide entity names to one or more external services that may provide a variety of attributes associated with the provided entity names. For example, for some embodiments, ingestion engines may be arranged to submit entity names to a government financial reporting database to collect various attributes that may be collected and published by the government.

In some embodiments, one or more ingestion models may be arranged to include natural language processing that identifies candidate evidence (e.g., words, phrases) that may be verified based on one or more entity-related databases. For example, candidate corporation names may be checked against relevant corporate registries. Likewise, for example, ingestion models may be arranged to determine particular entity attributes, such as, board members, officers, investors, key employees, or the like, by searching one or more entity-related databases using the evidence found in the raw information.
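
For illustration only, the two-step pattern described above (find candidate evidence, then confirm it against an entity-related database) may be sketched as follows. The registry contents, the identifiers, the regular expression, and the function name find_entities are hypothetical; an actual ingestion model may employ classifiers or natural language processing rather than a single regular expression.

```python
import re

# Hypothetical stand-in for a corporate registry keyed by lower-cased name.
KNOWN_ENTITIES = {
    "acme solar inc": "ORG-001",
    "borealis wind llc": "ORG-002",
}

# Candidate evidence: one or more capitalized words followed by a legal suffix.
CANDIDATE = re.compile(r"\b(?:[A-Z][a-z]+ )+(?:Inc|LLC|Corp)\b")


def find_entities(text, registry=KNOWN_ENTITIES):
    """Return (name, registry id) pairs for candidates confirmed by the registry."""
    confirmed = []
    for match in CANDIDATE.finditer(text):
        name = match.group(0)
        entity_id = registry.get(name.lower())
        if entity_id is not None:          # unconfirmed candidates are dropped
            confirmed.append((name, entity_id))
    return confirmed
```

Candidates that look like entity names but are absent from the registry are discarded in this sketch; an embodiment could instead queue them for confirmation by an external service.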

At block 1106, ingestion engines may be arranged to generate one or more partial entity graphs based on the entity information and the one or more ingestion models. Similar to how partial concept graphs may be generated from portions of raw information, in some embodiments, partial entity graphs may be generated from portions of raw information.

In one or more of the various embodiments, ingestion engines may be arranged to employ one or more ingestion models that may be configured to determine or evaluate various relationships between entities. In some embodiments, entities may be associated with a variety of attributes, some or all of which may infer or assert one or more relationships with other entities.

Accordingly, in some embodiments, relationships between entities may be based on a bundle of individual relationships that may be individually considered or evaluated. For example, in some embodiments, entities located in the same town may be related based on geographic location; entities that share board members may be related based on the shared board members; entities that are engaging in litigation may be related because of shared litigation; entities that sell the same product or compete in the same market may be considered related, and so on.

In one or more of the various embodiments, one or more ingestion models may be specialized to identify specific relationships between entities.

In some embodiments, one or more ingestion models may be arranged to determine one or more relationships between entities based on information from third-party or external sources rather than exclusively from raw information provided by data sources. For example, for some embodiments, an ingestion model may be configured to employ geographic information associated with an entity to look up other entities that may be related based on geographic location by submitting address information discovered in raw information to one or more databases. Thus, in some embodiments, information used to determine entity relationships may be obtained from sources outside of the provided raw information.

At block 1108, in one or more of the various embodiments, ingestion engines may be arranged to add the one or more partial entity graphs to a main entity graph.

In some embodiments, similar to how partial concept graphs may be added to central concept graphs, partial entity graphs may be added to one or more central entity graphs.

However, in one or more of the various embodiments, ingestion engines may be arranged to enable strength of relationship scores for entities to be computed differently depending on the particular types of relationships as well as the number of times the same relationship is observed.

For example, in some embodiments, entities related because one entity is a wholly owned subsidiary of the other may be considered to have a strength of relationship score that exceeds two entities related because they are located in the same building.

Further, in one or more of the various embodiments, ingestion engines may be arranged to compute different strength of relationship scores for different relationships based on the different attributes of two related entities. Accordingly, in some embodiments, relationships based on different attributes may be considered different dimensions of the overall relationship. In some embodiments, ingestion engines may be arranged to provide combined strength of relationship scores of related entities by combining one or more strength of relationship scores that may be associated with the one or more dimensions of their relationship.
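
One hedged way to picture combining per-dimension scores is a weighted sum over relationship types and observation counts. The dimension names, the weight values, and the function name combined_strength below are hypothetical and serve only to show that a single strong dimension (e.g., subsidiary ownership) can outweigh repeated observations of a weak one (e.g., a shared building).

```python
# Hypothetical per-dimension weights; real values would come from configuration.
DIMENSION_WEIGHTS = {
    "subsidiary": 1.0,
    "shared_board_member": 0.6,
    "same_building": 0.1,
}


def combined_strength(dimension_counts, weights=DIMENSION_WEIGHTS):
    """Combine dimensions as sum(weight * observation count).

    dimension_counts maps a relationship type to the number of times that
    relationship was observed. Unknown relationship types contribute nothing.
    """
    total = 0.0
    for dimension, count in dimension_counts.items():
        total += weights.get(dimension, 0.0) * count
    return total
```

A weighted sum is only one possible combining expression; other embodiments may normalize, cap, or otherwise transform the per-dimension scores.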

Next, in one or more of the various embodiments, control may be returned to a calling process.

FIG. 12 illustrates a flowchart for process 1200 for associating thematic concepts and organizations in accordance with one or more of the various embodiments. After a start block, at block 1202, in one or more of the various embodiments, ingestion engines may be arranged to provide a concept graph and an entity graph.

As described above, ingestion engines may be arranged to generate or maintain one or more concept graphs and one or more entity graphs based on ingested information, curated information, configuration information, or the like, provided via one or more data sources.

At block 1204, in one or more of the various embodiments, ingestion engines may be arranged to generate a data graph based on the concept graph and entity graph.

In one or more of the various embodiments, data graphs may be another knowledge graph based on one or more concept graphs and one or more entity graphs. Accordingly, in some embodiments, ingestion engines may be arranged to merge concept graphs with entity graphs to provide data graphs that include concepts, concept relationships, entities, entity relationships, or the like.

In one or more of the various embodiments, ingestion engines may be arranged to generate one or more relationships between concepts and entities. In one or more of the various embodiments, ingestion engines may be arranged to employ one or more ingestion models, or the like, that may be arranged to execute one or more rules, instructions, or the like, for generating data graphs from combining concept graphs and entity graphs.

Accordingly, in one or more of the various embodiments, ingestion models may be arranged to relate entities with concepts based on one or more attributes of the entities. For example, in some embodiments, an entity (e.g., company) that produces solar panels may be associated with the concept ‘solar power,’ or the like.

Also, in some embodiments, one or more ingestion models may be arranged to evaluate raw information that may associate one or more concepts with one or more entities based on communications by or about entities. Thus, in some embodiments, press releases expressing an entity's support for various initiatives, technologies, social issues, or the like, may determine one or more relationships between the entity and one or more related concepts.

In one or more of the various embodiments, ingestion engines may be arranged to generate data graphs that may include one or more relationships that may be omitted from the concept graphs or entity graphs. In some embodiments, during the process of merging concept graphs and entity graphs, ingestion engines may be arranged to determine one or more relationships between concepts based on how the concepts may be related to one or more entities. For example, a new relationship between concepts may be inferred based on how the concepts relate to one or more entities. For example, one or more concepts associated with two strongly/closely related entities may be considered related based on the strength of relationship score of the related entities.

Likewise, in some embodiments, one or more entities that may be otherwise unrelated may be related in a data graph because they both are related to some of the same or similar concepts.
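
The inference described above, where otherwise unrelated entities become related through shared concepts, may be sketched as follows. The function name related_by_shared_concepts, the min_shared threshold, and the dictionary-of-sets representation of the data graph are assumptions made for illustration.

```python
from itertools import combinations


def related_by_shared_concepts(entity_concepts, min_shared=1):
    """Link every pair of entities that shares at least min_shared concepts.

    entity_concepts maps an entity name to the set of concepts it is
    related to in the data graph. The result maps (entity, entity) pairs,
    in sorted order, to the set of concepts they have in common.
    """
    links = {}
    for first, second in combinations(sorted(entity_concepts), 2):
        shared = entity_concepts[first] & entity_concepts[second]
        if len(shared) >= min_shared:
            links[(first, second)] = shared
    return links
```

This sketch compares all pairs and ignores strength of relationship scores; an embodiment could weight each inferred link by the strengths of the underlying entity-to-concept relationships.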

Accordingly, in some embodiments, portfolio engines may be arranged to traverse data graphs to gather a full multi-dimensional view of entities and related entities based on concepts and entity attributes.

At decision block 1206, in one or more of the various embodiments, if the concept graph or entity graph may be modified, control may flow to block 1208; otherwise, control may loop back to decision block 1206.

In one or more of the various embodiments, ingestion engines may be arranged to continuously ingest raw information from one or more data sources. Accordingly, in some embodiments, concept graphs or entity graphs may be continuously updated. Thus, in some embodiments, as concept graphs or entity graphs may be modified, ingestion engines may be arranged to automatically update one or more associated data graphs.

At block 1208, in one or more of the various embodiments, ingestion engines may be arranged to update the data graph based on the updates. As concept graphs or entity graphs may be updated, ingestion engines may be arranged to modify the nodes or relationships in data graphs to reflect changes that may occur in one or more related concept graphs or entity graphs.

In some embodiments, ingestion engines may be arranged to periodically check for changes to enable more than one change to be processed in batches.

At block 1210, in one or more of the various embodiments, optionally, ingestion engines may be arranged to generate time series updates based on updates to the data graph.

In one or more of the various embodiments, portfolio platforms may be arranged to track how entities may change over time. Accordingly, in some embodiments, as one or more entity attributes change, one or more of the changes may be recorded in a data store that supports time-series information (e.g., time-series data store). Also, in some embodiments, ingestion engines may be arranged to store relationship modifications, changes to strength of relationship scores, or the like, in a time-series data store.

Accordingly, in some embodiments, portfolio engines may be arranged to provide organizations or users reports that include or highlight how entity attributes or relationships have changed over time.

Next, in one or more of the various embodiments, control may be returned to a calling process.

FIG. 13 illustrates a flowchart for process 1300 for associating thematic concepts and organizations in accordance with one or more of the various embodiments. After a start block, at block 1302, in one or more of the various embodiments, query information may be provided to ingestion engines. In one or more of the various embodiments, portfolio platforms may be arranged to provide one or more user interfaces that enable users to provide the query information. Also, in one or more of the various embodiments, portfolio engines may be arranged to enable one or more automated services or processes to submit query information via one or more interfaces or APIs.

In some embodiments, query information may include discrete collections of concepts, themes, or the like, selected from user interface pick lists, checkbox sets, radio button groups, or the like, or combination thereof.

In one or more of the various embodiments, portfolio engines may be arranged to periodically/automatically run one or more pre-defined queries. For example, for some embodiments, a pre-defined query looking for entities associated with solar power that may be located in particular geographic areas, funded with government grants, and employing 100 or more employees may be executed every day to identify new entities to consider adding to a portfolio.

At block 1304, in one or more of the various embodiments, portfolio engines may be arranged to determine one or more concepts in the query information.

In one or more of the various embodiments, portfolio platforms may be arranged to enable query information that includes natural text. Accordingly, in some embodiments, specialized natural language processing may be executed to determine the concepts included in the query information.

In other embodiments, query information may be formatted such that concepts are readily determined based on the format of the query information. For example, in some embodiments, query information may be formatted to include a list of comma separated words representing concepts.

Also, in one or more of the various embodiments, portfolio engines may be arranged to expand one or more concepts included in the query information. In some embodiments, portfolio engines may be arranged to employ concept graphs to identify one or more other concepts that may be related to the one or more concepts included in the query information.

In some embodiments, portfolio engines may be arranged to locate the concept nodes in the concept graph that may correspond to the concepts in the query information. In some embodiments, portfolio engines may be arranged to traverse the concept graph starting at the query information concept nodes to search for one or more related concepts.

In one or more of the various embodiments, portfolio engines may be arranged to continue expanding the search for related concepts based on strength of relationship scores associated with connected concept nodes. Also, in some embodiments, portfolio engines may be arranged to consider the distance from the starting node of the search to determine if the search should continue. For example, portfolio engines may be arranged to de-weight strength of relationship scores as the distance from the starting nodes increases. In other embodiments, portfolio engines may be arranged to search for related concepts until a fixed maximum or minimum number of related concepts may be determined.
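
The concept expansion described above may be sketched as a best-first traversal in which each hop multiplies the running score by the edge strength and a per-hop decay factor. The function name expand_concepts, the decay and min_score parameters, the assumption that strengths are normalized to the range 0 to 1, and the adjacency-dictionary graph format are all hypothetical.

```python
import heapq


def expand_concepts(graph, start_concepts, decay=0.5, min_score=0.2, max_concepts=10):
    """Return related concepts mapped to de-weighted relevance scores.

    graph maps a concept to {neighbor: strength}, with strengths in [0, 1].
    Each hop multiplies the running score by strength * decay, so distant
    concepts are de-weighted; expansion stops below min_score.
    """
    best = {concept: 1.0 for concept in start_concepts}
    heap = [(-1.0, concept) for concept in start_concepts]
    heapq.heapify(heap)
    while heap:
        negative_score, node = heapq.heappop(heap)
        score = -negative_score
        if score < best.get(node, 0.0):
            continue                                  # stale heap entry
        for neighbor, strength in graph.get(node, {}).items():
            candidate = score * strength * decay
            if candidate >= min_score and candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    related = {c: s for c, s in best.items() if c not in start_concepts}
    ranked = sorted(related.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:max_concepts])                # cap the number of results
```

In this sketch the max_concepts parameter corresponds to the fixed maximum number of related concepts mentioned above, and min_score plays the role of a stopping criterion based on de-weighted strength.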

At block 1306, in one or more of the various embodiments, portfolio engines may be arranged to determine one or more candidate entities based on a data graph and one or more portfolio models.

In one or more of the various embodiments, portfolio engines may be arranged to search data graphs for entities that may be related to the one or more concepts determined from the query information.

At block 1308, in one or more of the various embodiments, optionally, portfolio engines may be arranged to add one or more selected entities to one or more portfolios.

In one or more of the various embodiments, portfolios may be collections of entities that meet constraints or conditions outlined in the query information. In some embodiments, portfolio engines may be arranged to automatically create portfolios for queries such that entities that conform to the query may be automatically added. Likewise, in some embodiments, portfolio engines may be arranged to automatically add one or more entities to one or more portfolios that already exist.

Similarly, in some embodiments, portfolio engines may be arranged to automatically remove one or more entities from one or more portfolios if they may fall out of compliance with query information (e.g., conditions) associated with one or more portfolios.
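
The automatic addition and removal of entities described above may be illustrated with a small reconciliation step. The function name sync_portfolio, the attribute dictionary layout, and the use of a predicate to stand in for the query conditions are assumptions for this sketch.

```python
def sync_portfolio(portfolio, entities, conforms):
    """Reconcile a portfolio with the entities that currently satisfy a query.

    portfolio is the current set of member names, entities maps a name to
    its attribute dictionary, and conforms is a predicate that encodes the
    query conditions. Returns (new members, added, removed).
    """
    conforming = {name for name, attrs in entities.items() if conforms(attrs)}
    added = conforming - portfolio       # newly conforming entities
    removed = portfolio - conforming     # members that fell out of compliance
    return conforming, added, removed
```

For instance, the pre-defined solar power query mentioned earlier could be expressed as a predicate that checks for the concept and for 100 or more employees, and the returned added and removed sets could feed reports or notifications.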

Note, this block is marked optional because in some cases candidate entities may not be added to portfolios.

Next, in one or more of the various embodiments, control may be returned to a calling process.

Illustrative Logical System Architecture for Modeling Conformance to Thematic Concepts

FIG. 14 illustrates a logical schematic of a portion of system 1400 for modeling conformance to thematic concepts in accordance with one or more of the various embodiments. In one or more of the various embodiments, portfolio platforms may be arranged to include various data structures for representing concepts, themes, and conformance to concepts or themes.

In one or more of the various embodiments, themes may be considered a collection of one or more concepts. In some embodiments, themes may include one or more concepts that may be related to each other based on relationships determined from concept graphs. In some embodiments, portfolio engines may be arranged to automatically generate themes based on concept graphs. For example, in some embodiments, concepts in concept graphs may be associated into themes based on clustering rules that evaluate the relationships and strength of relationships among concepts. Thus, in some embodiments, some concepts may be automatically grouped into themes. Also, in some embodiments, portfolio engines may be arranged to enable users to define themes based on manually selecting one or more concepts to include in a theme. Further, in some embodiments, portfolio engines may be arranged to recommend one or more concepts that may be determined based on filters or searches. Accordingly, in some embodiments, portfolio engines may be arranged to provide one or more user interfaces that enable users to select concepts to associate with themes.

In this example, for some embodiments, theme 1402 represents a data structure that includes some concepts that may be considered part of a theme. In this example, theme 1402 may be considered a collection of concepts that may be related to green energy. In one or more of the various embodiments, themes may limit the number of concepts based on filters based on the strength of relationship among or between the concepts as determined from concept graphs. Also, in some embodiments, other considerations such as, distance (within the concept graph), number of co-relationships, or the like, may be applied to determine the concepts to automatically include or recommend for inclusion into a theme.

In this example, for some embodiments, theme 1402 includes label 1404 that names or describes the theme. Also, in some embodiments, label 1404 may be a concept as well. In this example, label 1404 both describes the theme and includes the concept green energy. Also, in this example, concepts 1406 represent the concepts that may be included in theme 1402. In this example, theme 1402 may be considered to be based on the portion of concept graph 600 shown in FIG. 6. However, in this example, theme 1402 illustrates how a theme may exclude one or more otherwise related concepts based on criteria described above or manual intervention. In some embodiments, portfolio engines may be arranged to employ rules, filters, threshold values, instructions, or the like, provided via configuration information to determine the criteria for selecting or recommending concepts to be included in themes.

Also, in one or more of the various embodiments, portfolio engines may be arranged to employ one or more data structures, such as, score container 1408 to store information that describes how well an organization or a portfolio of organizations correlates with a given theme. In some embodiments, portfolio engines may be arranged to employ score containers to associate thematic information with organizations or portfolios (e.g., collections of organizations). Note, while one of ordinary skill in the art will appreciate that score containers may be arranged to represent thematic score information for organizations or portfolios, for brevity and clarity, score containers may be described as corresponding to organizations even though some or all score containers may correspond to portfolios comprised of more than one organization.

In some embodiments, score containers, such as, score container 1408 may include fields for storing or referencing various information associated with modeling conformance to thematic concepts. In some embodiments, the particular fields may vary depending on local requirements or local circumstances. Thus, in some embodiments, portfolio engines may be arranged to employ configuration information to determine the exact fields to include in score containers. In some embodiments, score container fields may include: field 1410 for storing an identifier or reference to the organization corresponding to the score container; field 1412 for storing a label or description of the theme; field 1414 for storing a thematic score that represents how well the organization matches the theme; field 1416 for storing a trend value that represents if the thematic score is predicted to increase or decrease; field 1418 for storing an organization score that represents how much the activity/messaging (e.g., ingested information) directly associated with the organization matches or is otherwise correlated with the theme; field 1420 for storing a score that represents how well other organizations that may be related to the organization may conform to the theme; field 1422 for storing a score representing how activity/messaging of society in general may support the theme; field 1424 for storing references to samples of evidence that support the scores included in the score container; or the like.
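
A score container with the fields enumerated above may be pictured as a simple record type. The class name ScoreContainer, the attribute names, and the chosen types are hypothetical; the comments map each attribute to the corresponding field number from FIG. 14.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScoreContainer:
    """Illustrative layout only; actual fields may vary with configuration."""
    organization_id: str          # field 1410: organization identifier or reference
    theme_label: str              # field 1412: label or description of the theme
    thematic_score: float         # field 1414: overall conformance to the theme
    trend: float                  # field 1416: predicted increase or decrease
    organization_score: float     # field 1418: the organization's own activity/messaging
    network_score: float          # field 1420: conformance of related organizations
    general_score: float          # field 1422: support for the theme in society
    evidence_refs: List[str] = field(default_factory=list)  # field 1424: evidence samples
```

The same layout could represent a portfolio by storing a portfolio identifier in place of the organization identifier.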

In one or more of the various embodiments, organization scores, network scores, general scores, or the like, may be combined to produce the overall thematic score for the organization. Accordingly, in some embodiments, a thematic score may provide a discrete value that quantifies or models the level of conformance to thematic concepts for an organization.

In one or more of the various embodiments, organization scores may be generated based on evaluating the strength of relationships between the relevant concepts and the organization based on a data graph or other knowledge graphs. Accordingly, in some embodiments, if a theme includes five concepts, the organization score may represent how well the organization conforms to all the concepts. For example, in some embodiments, an organization score may be a weighted average, or the like, of partial scores for each individual concept in a theme.
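
The weighted average mentioned above may be sketched as follows. The function name organization_score and the default weight of 1.0 for concepts without an explicit weight are assumptions for illustration.

```python
def organization_score(partial_scores, concept_weights):
    """Weighted average of per-concept partial scores for one organization.

    partial_scores maps each concept in the theme to its partial score;
    concept_weights maps concepts to weights (missing concepts default to 1.0).
    """
    total_weight = sum(concept_weights.get(c, 1.0) for c in partial_scores)
    if total_weight == 0:
        return 0.0                       # no weighted concepts to average
    weighted_sum = sum(
        score * concept_weights.get(concept, 1.0)
        for concept, score in partial_scores.items()
    )
    return weighted_sum / total_weight
```

For example, partial scores of 90 for solar power (weight 3) and 30 for wind power (weight 1) yield (90 × 3 + 30 × 1) / 4 = 75.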

In one or more of the various embodiments, network scores may represent how well related organizations may conform to the theme. Thus, in some embodiments, network scores may be based on a combination of how the network of organizations conforms to the relevant concepts. In some embodiments, data graphs or entity graphs may be employed to identify organizations that may be considered in the network of the organization of interest. Similar to how concepts may be automatically grouped into themes, the network organizations may be determined based on one or more rules, instructions, or the like, that evaluate the strength of relationships between other organizations and the organization of interest.

In one or more of the various embodiments, general scores may represent the interest or prominence of the theme in the wider public or society. In one or more of the various embodiments, general scores may be determined based on ingesting information from government policy documents, publications from thought leaders, social media trends, opinion pieces, or the like. For example, if a government agency announces a policy to use tax subsidies to increase the uptake of solar energy, the general score associated with a theme that is associated with solar energy may increase.

In one or more of the various embodiments, portfolio engines may be arranged to employ one or more scoring models to compute information to include in score containers. In one or more of the various embodiments, scoring models may be tuned or modified to accommodate local requirements or local circumstances. In some embodiments, portfolio engines may be arranged to employ feedback information to automatically modify scoring models. In some embodiments, feedback information may include passive monitoring of user interaction with theme-based portfolios. For example, in some embodiments, if users demonstrate a preference for the results generated by one scoring model over another, portfolio engines may be arranged to automatically de-rank or otherwise modify existing scoring models.

Likewise, in some embodiments, if portfolio engines observe that organization scores, network scores, or general scores, generated by one or more scoring models deviate or otherwise become inconsistent, portfolio engines may be arranged to modify the relevant scoring models. Further, in some embodiments, portfolio engines may be arranged to automatically generate alternative scoring models that may be employed to generate alternate thematic scores that may be compared with deployed/active scoring models. For example, in some embodiments, some scoring models may include various weights, scaling, threshold values, or the like, that may impact thematic scores. Accordingly, in some embodiments, portfolio engines may be arranged to automatically (e.g., experimentally) modify these values and compare the results over time with other scoring models.

In one or more of the various embodiments, portfolio engines may be arranged to set a defined range of values for thematic scores, such as, 0-100, or the like. Alternatively, in some embodiments, thematic scores or one or more partial scores may be bucketed into discrete values, such as, NONE, LOW, MED, HIGH, or the like, with value ranges assigned to each discrete value.
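
Bucketing a 0-100 score into discrete values may be sketched with a sorted list of boundaries. The boundary values 10, 40, and 70 and the function name bucket_score are hypothetical; the embodiments do not prescribe particular ranges.

```python
from bisect import bisect_right

# Hypothetical lower bounds of LOW, MED, and HIGH on a 0-100 scale.
BUCKET_BOUNDS = [10, 40, 70]
BUCKET_LABELS = ["NONE", "LOW", "MED", "HIGH"]


def bucket_score(score):
    """Map a numeric score to a discrete label.

    Scores below 10 map to NONE, 10 up to (not including) 40 to LOW,
    40 up to (not including) 70 to MED, and 70 or more to HIGH.
    """
    return BUCKET_LABELS[bisect_right(BUCKET_BOUNDS, score)]
```

Because the boundaries live in a plain list, they could be supplied via configuration information to account for local requirements.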

In one or more of the various embodiments, portfolio engines may be arranged to provide trend scores based on predicted or historical increases or decreases in scores. In some embodiments, portfolio engines may be arranged to employ predictive models (e.g., sub-models included in or associated with scoring models) to generate the values for trend scores. In some embodiments, trend scores may be correlated with the current or expected rate(s) of change of the corresponding scores.
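
As one simple stand-in for a predictive model, a trend score may be taken as the least-squares slope of historical thematic scores sampled at equal intervals. This choice, and the function name trend_score, are assumptions made for illustration; embodiments may employ other predictive sub-models.

```python
def trend_score(history):
    """Least-squares slope of equally spaced historical scores.

    A positive value suggests the score has been increasing per interval,
    a negative value that it has been decreasing. Fewer than two samples
    give no trend information, so 0.0 is returned.
    """
    n = len(history)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(history))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator
```

The history values could be read from the time-series data store described above.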

In one or more of the various embodiments, portfolio engines may be arranged to provide one or more references to evidence that may be associated with the concepts used to generate thematic scores. In some embodiments, portfolio engines may be arranged to sample a portion of the information or documents collected by the ingestion engine and include references thereto to enable review by users. In some embodiments, portfolio engines may be arranged to randomly select a defined number of evidence references. Also, in some embodiments, portfolio engines may be arranged to apply rules, ranking, filters, or the like, to select evidence references rather than picking them randomly. In some embodiments, criteria for including or excluding evidence references may depend on various features of the evidence, including word counts or other observed characteristics of the evidence. In some embodiments, portfolio engines may be arranged to automatically exclude some evidence based on one or more characteristics of the evidence, including, source, subject matter, sensitivity, reliability, authenticity, format, or the like. In some embodiments, the criteria for including or excluding evidence references may vary depending on users, organizations, or the like. Thus, in one or more of the various embodiments, portfolio engines may be arranged to determine evidence references based on rules, or the like, provided via configuration information to account for local circumstances or local requirements.

FIG. 15 illustrates a logical schematic of a portion of scoring system 1500 for modeling conformance to thematic concepts in accordance with one or more of the various embodiments. As introduced above, in one or more of the various embodiments, portfolio platforms may include scoring engines that may be arranged to generate thematic scores for organizations or portfolios. In one or more of the various embodiments, scoring engines, such as, scoring engine 1502 may be arranged to employ one or more scoring models, such as, scoring model 1508, to generate thematic scores, such as 1506. Accordingly, in some embodiments, scoring engines may be arranged to accept inputs, such as, inputs 1504 that may include knowledge graphs, organizations, portfolios, concepts, themes, time series information, historical thematic scores, or the like, and evaluate them using one or more scoring models to generate thematic scores.

FIG. 16 illustrates a logical schematic of a portion of scoring system 1600 for modeling conformance to thematic concepts in accordance with one or more of the various embodiments. As described above, portfolio platforms may be arranged to include one or more scoring engines that may employ scoring models to generate thematic scores for organizations or portfolios.

In one or more of the various embodiments, system 1600 includes data structures that illustrate partial or intermediate results that may be generated by scoring models. One of ordinary skill in the art will appreciate that the innovations described herein disclose features for importing or configuring various scoring models via configuration information to account for local requirements or local circumstances. Accordingly, in this example, system 1600 may be considered a simplified non-limiting example that illustrates how scoring engines may generate thematic scores.

In this example, for some embodiments, scoring engines may be arranged to generate one or more data structures, such as, data structure 1602 that may store some or all of the partial results that may contribute to the generation of thematic scores. In this example, data structure 1602 includes several columns, including: column 1604 for storing a concept identifier; column 1606 for storing an organization score for each concept in the theme; column 1608 for storing an organization score weight for each concept in the theme; column 1610 for storing a network score for each concept in the theme; column 1612 for storing general scores for each concept in the theme; column 1614 for storing a sub-score for each concept in the theme that is a combination of the organization score, network score, or general score of each concept; column 1616 for storing a concept weight that represents the influence of each concept on the overall theme score; column 1618 for storing partial scores that represent how much each concept contributes to the theme score; or the like. Also, in this example, records 1620 may include a record for each concept in the theme being considered. Note, for clarity and brevity, descriptions of some columns have been omitted because their meaning is apparent based on descriptions of similar columns.

Also, in this example, cell 1604 represents the thematic score for the organization under consideration. Similarly, in this example: the theme organization score may be computed based on averaging the concept organization scores included in column 1606; the theme network score may be computed based on averaging the concept network scores included in column 1610; the theme general score may be computed based on averaging the concept general scores included in column 1612; or the like.
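For illustration only, the arithmetic suggested by data structure 1602 may be sketched as shown below. The sketch assumes that each sub-score is a weighted sum of the organization, network, and general scores, and that the thematic score is the sum of the per-concept partial scores; a given scoring model may define these combinations differently. The `ConceptRow` type, the network and general weight fields, and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ConceptRow:
    concept: str           # cf. column 1604
    org_score: float       # cf. column 1606
    org_weight: float      # cf. column 1608
    network_score: float   # cf. column 1610
    network_weight: float  # assumed analogous weight column
    general_score: float   # cf. column 1612
    general_weight: float  # assumed analogous weight column
    concept_weight: float  # cf. column 1616

    @property
    def sub_score(self) -> float:
        # cf. column 1614: assumed to be a weighted sum of the three scores.
        return (self.org_score * self.org_weight
                + self.network_score * self.network_weight
                + self.general_score * self.general_weight)

    @property
    def partial_score(self) -> float:
        # cf. column 1618: contribution of this concept to the theme score.
        return self.sub_score * self.concept_weight


def theme_summary(rows):
    """Combine per-concept rows into theme-level values."""
    return {
        "thematic_score": sum(r.partial_score for r in rows),
        "organization": mean(r.org_score for r in rows),
        "network": mean(r.network_score for r in rows),
        "general": mean(r.general_score for r in rows),
    }


rows = [
    ConceptRow("Clean Energy", 80.0, 0.5, 60.0, 0.3, 40.0, 0.2, 0.5),
    ConceptRow("Recycling", 40.0, 0.5, 20.0, 0.3, 60.0, 0.2, 0.5),
]
summary = theme_summary(rows)
```

The theme organization, network, and general scores are averaged over the concept rows, matching the averaging described for columns 1606, 1610, and 1612.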

In one or more of the various embodiments, scores (e.g., organization, network, general, or the like) may be generated for each concept based on ingested evidence. In one or more of the various embodiments, different sources or types of ingested evidence may carry different weight. Likewise, in some embodiments, different evidence may be evaluated differently. For example, some types of evidence may be scored using various metrics, such as, word counts, topic evaluation, recency, or the like. Accordingly, in some embodiments, scoring models may be arranged to define rules or instructions for evaluating ingested information while generating thematic scores.

In this example, data structure 1622 may be considered a simplified non-limiting example for computing an organization score for the concept Clean Energy. In this example, data structure 1622 includes: column 1624 for storing the source/type indicator of ingested evidence; column 1626 for storing a mention metric that reflects how many times concept related content was discovered in the given evidence; column 1628 for storing a weight value for the different sources/types of evidence; and column 1630 for storing a partial score based on each source/type of evidence. Also, in this example, records 1632 may be considered to include records for the different evidence that may have been considered.
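A minimal sketch of data structure 1622 follows. It assumes that each partial score is the mention metric multiplied by the source/type weight and that the organization score for the concept is the sum of the partial scores; the source types, counts, and weights shown are invented for illustration and are not taken from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    source_type: str  # cf. column 1624
    mentions: int     # cf. column 1626
    weight: float     # cf. column 1628

    @property
    def partial_score(self) -> float:
        # cf. column 1630: assumed to be mentions scaled by the weight.
        return self.mentions * self.weight


def organization_score(records) -> float:
    """Sum the partial scores across all evidence sources/types."""
    return sum(r.partial_score for r in records)


# Hypothetical records for the concept Clean Energy.
records = [
    EvidenceRecord("annual report", 12, 2.0),
    EvidenceRecord("press release", 5, 1.0),
    EvidenceRecord("news article", 8, 0.5),
]
clean_energy_org_score = organization_score(records)
```

Under these assumptions, evidence types given a larger weight contribute proportionally more to the organization score for the same number of mentions.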

Note, in some embodiments, scoring models may be arbitrarily complex depending on the needs or resources available to a portfolio platform. For example, other scoring criteria may be based on traversing one or more knowledge graphs to determine partial scores that may be combined into overall thematic scores. Accordingly, in some embodiments, one of ordinary skill in the art will appreciate that the scoring models used in production systems may include more or fewer data structures, columns, records, or the like, than illustrated here. However, one of ordinary skill in the art will appreciate that the description herein is at least sufficient for disclosing the innovations described herein.

FIG. 17 illustrates a logical representation of a portion of user interface 1700 for modeling conformance to thematic concepts in accordance with one or more of the various embodiments. In some embodiments, user interface 1700 may be arranged to include one or more panels, such as, portfolio panel 1702, thematic score panel 1704, or the like.

In one or more of the various embodiments, user interface 1700 may be displayed on one or more hardware displays, such as, client computer displays, mobile device displays, or the like. In some embodiments, user interface 1700 may be provided via a native application or as a web application hosted in a web browser or other similar applications. One of ordinary skill in the art will appreciate that for at least clarity or brevity many details common to commercial/production user interfaces have been omitted from user interface 1700. Likewise, in some embodiments, user interfaces may be arranged differently than shown depending on local circumstances or local requirements. However, one of ordinary skill in the art will appreciate that the disclosure/description of user interface 1700 is at least sufficient for disclosing the innovations included herein.

In this example, portfolio panel 1702 may be employed to display portions of a portfolio that includes various organizations. In this example, the listed organizations represent organizations that may be included in a portfolio that a user may be reviewing. In some embodiments, the particular portfolio shown in portfolio panel 1702 may be determined in various ways depending on the user interface. For example, a user interface may include one or more user interface controls that enable users to provide query information that may be employed to search for one or more portfolios.

In this example, thematic score panel 1704 is employed to display one or more portions of a visualization or display thematic score information that may be associated with portfolio item 1706 (e.g., Organization B). In some embodiments, thematic score information may be displayed using visualizations, reports, or the like. Likewise, in some embodiments, thematic score information for more than one organization or more than one portfolio may be displayed in various visualizations, reports, or the like, for display to users. One of ordinary skill in the art will appreciate that other user interfaces, visualizations, or the like, may be employed without departing from the scope of the innovations disclosed herein.

Generalized Operations for Data Ingestion

FIGS. 18-20 represent generalized operations for modeling conformance to thematic concepts in accordance with one or more of the various embodiments. In one or more of the various embodiments, processes 1800, 1900, and 2000 described in conjunction with FIGS. 18-20 may be implemented by or executed by one or more processors on a single network computer, such as network computer 300 of FIG. 3. In other embodiments, these processes, or portions thereof, may be implemented by or executed on a plurality of network computers, such as network computer 300 of FIG. 3. In yet other embodiments, these processes, or portions thereof, may be implemented by or executed on one or more virtualized computers, such as, those in a cloud-based environment. However, embodiments are not so limited and various combinations of network computers, client computers, or the like may be utilized. Further, in one or more of the various embodiments, the processes described in conjunction with FIGS. 18-20 may perform actions for modeling conformance to thematic concepts in accordance with at least one of the various embodiments, architectures, or processes, such as those described in conjunction with FIGS. 4-17. Further, in one or more of the various embodiments, some or all of the actions performed by processes 1800, 1900, and 2000 may be executed in part by ingestion engine 322, portfolio engine 324, indexing engine 326, scoring engine 327, or the like.

FIG. 18 illustrates an overview flowchart for process 1800 for modeling conformance to thematic concepts in accordance with one or more of the various embodiments. After a start block, at block 1802, in one or more of the various embodiments, portfolio engines may be arranged to determine one or more themes. As described above, in some embodiments, themes may represent one or more concepts that may be grouped or associated with each other based on various criteria.

In one or more of the various embodiments, portfolio engines may be arranged to provide one or more user interfaces that enable users to select one or more themes of interest. Also, in some embodiments, portfolio engines may be arranged to automatically select one or more themes based on one or more of rules, triggers, alarms, or the like, that may be determined based on configuration information. In some embodiments, portfolio engines may be arranged to automatically select one or more themes based on one or more of their associated concepts rising in prominence as determined by one or more metrics associated with ingested information.

At block 1804, in one or more of the various embodiments, optionally, portfolio engines may be arranged to determine one or more portfolios. In one or more of the various embodiments, portfolio engines may be arranged to determine one or more portfolios that may be associated with one or more organizations. In some embodiments, portfolio engines may be arranged to provide one or more user interfaces that enable users to select one or more portfolios. Also, in some embodiments, portfolio engines may be arranged to automatically select one or more portfolios based on one or more of rules, triggers, alarms, or the like, that may be determined based on configuration information.

Note, this block is indicated as being optional because in some cases for some embodiments organizations may be processed individually rather than as part of a portfolio.

At block 1806, in one or more of the various embodiments, portfolio engines may be arranged to determine one or more organizations. In some embodiments, portfolio engines may be arranged to provide one or more user interfaces that enable users to select one or more organizations. Also, in some embodiments, portfolio engines may be arranged to automatically select one or more organizations based on one or more of rules, triggers, alarms, or the like, that may be determined based on configuration information.

At block 1808, in one or more of the various embodiments, portfolio engines may be arranged to generate thematic scores for the one or more organizations based on one or more thematic scoring models. In one or more of the various embodiments, portfolio platforms may be arranged to employ one or more scoring engines that may be arranged to employ one or more scoring models to generate thematic scores for the one or more organizations.

As described herein, scoring models may be arranged to employ concept graphs, data graphs, ingested information, user feedback, or the like, to determine thematic scores for organizations.

At block 1810, in one or more of the various embodiments, optionally, portfolio engines may be arranged to determine thematic scores for the one or more portfolios. In one or more of the various embodiments, if a portfolio may be under evaluation, scoring engines may be arranged to generate a thematic score for an entire portfolio. Accordingly, in some embodiments, scoring engines may be arranged to execute one or more actions declared in scoring models to combine the thematic scores of organizations into a thematic score for the entire portfolio. For example, in some embodiments, portfolio engines may be arranged to generate portfolio thematic scores based on averaging the thematic scores of the organizations associated with the portfolio. Likewise, for example, portfolio engines may be arranged to generate portfolio thematic score containers that report the high thematic score, low thematic score, and the average thematic score for the organizations in the portfolio.
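By way of a hedged example, a portfolio thematic score container that reports the high, low, and average thematic scores may be sketched as follows. The `PortfolioScoreContainer` type, the `portfolio_scores` function, and the organization scores are hypothetical and shown only to illustrate the combination described above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict


@dataclass(frozen=True)
class PortfolioScoreContainer:
    high: float
    low: float
    average: float


def portfolio_scores(org_scores: Dict[str, float]) -> PortfolioScoreContainer:
    """Combine per-organization thematic scores into a portfolio container."""
    if not org_scores:
        raise ValueError("a portfolio needs at least one scored organization")
    values = list(org_scores.values())
    return PortfolioScoreContainer(
        high=max(values), low=min(values), average=mean(values)
    )


container = portfolio_scores(
    {"Organization A": 40.0, "Organization B": 70.0, "Organization C": 55.0}
)
```

A scoring model could substitute a different combining action, such as a weighted average, without changing the shape of the container.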

Note, this block is indicated as being optional because in some cases for some embodiments, thematic scores may be generated for organizations that may be processed individually rather than as part of a portfolio.

FIG. 19 illustrates a flowchart for process 1900 for determining related concepts that may be associated with a theme for modeling conformance to thematic concepts in accordance with one or more of the various embodiments. After a start block, at block 1902, in one or more of the various embodiments, portfolio engines may be arranged to provide one or more theme concepts.

In one or more of the various embodiments, a theme concept may be considered an anchor concept that a theme may be built around. In some embodiments, users or account holders may select a concept that may represent a theme they would like to explore. Accordingly, in some embodiments, the theme concept may be considered a concept that may be employed to discover or generate a theme that includes one or more other concepts that may be related to the provided theme concept.

Also, in one or more of the various embodiments, portfolio engines may be arranged to provide one or more user interfaces that enable users to manually select one or more concepts to include in a theme.

At block 1904, in one or more of the various embodiments, portfolio engines may be arranged to traverse one or more concept graphs to determine one or more related concepts.

As described above, themes may include or associate one or more concepts that may be related. In some embodiments, as described above, portfolio engines may be arranged to employ concept graphs to evaluate if one or more concepts may be related. Accordingly, in some embodiments, portfolio engines may be arranged to employ concept graphs to determine if one or more concepts should be associated with a theme. In one or more of the various embodiments, portfolio engines may be arranged to traverse concept graphs to determine one or more concepts that may be related to the theme concept. In some embodiments, portfolio engines may be arranged to evaluate various metrics, such as, number of relationships to other concepts, strength of relationships to other concepts, or the like, to determine if one or more concepts may be considered or recommended for consideration to include or associate with a theme or theme concept. Accordingly, in some embodiments, portfolio engines may be arranged to determine the rules, threshold values (e.g., minimum strength of relationship, or the like), criteria, or the like, based on configuration information to account for local circumstances or local requirements.

Note, although searching for related concepts is described as traversing concept graphs, one of ordinary skill in the art will appreciate that the implementation of the data structures representing concept graphs is not limited to conventional graphs with explicit nodes or edges. For example, in some embodiments, concept graphs may be implemented using contiguous memory, optimized indexes, caches, or the like, such that a traversal of the concept graph may include employing index pointers, counters, loops, or the like, that provide a logical traversal of the concept graphs.

At decision block 1906, in one or more of the various embodiments, if a concept visited during the traversal of the one or more concept graphs may be determined to be a related concept, control may flow to block 1908; otherwise, control may flow to decision block 1910.

At block 1908, in one or more of the various embodiments, portfolio engines may be arranged to include the visited concept in the theme. In one or more of the various embodiments, portfolio engines may be arranged to automatically include the related concept in the theme or associate it with the theme. Also, in one or more of the various embodiments, portfolio engines may be arranged to determine the one or more related concepts and enable users to manually select or confirm if one or more of the candidate related concepts may be included in a theme.

At decision block 1910, in one or more of the various embodiments, if the traversal continues, control may loop back to block 1904; otherwise, control may be returned to a calling process.

Next, in one or more of the various embodiments, control may be returned to a calling process.
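The loop of process 1900 may be sketched, under stated assumptions, as a breadth-first traversal that starts at the anchor (theme) concept. The sketch assumes the concept graph is available as a mapping from each concept to its neighbors and relationship strengths, and that a concept is considered related when the strength of the relationship meets a minimum threshold; the `build_theme` function, the `max_depth` limit, and the sample graph are hypothetical.

```python
from collections import deque


def build_theme(graph, anchor, min_strength, max_depth=2):
    """Collect concepts related to `anchor` by traversing `graph`."""
    theme = [anchor]
    included = {anchor}
    queue = deque([(anchor, 0)])
    while queue:  # cf. decision block 1910: does the traversal continue?
        concept, depth = queue.popleft()
        if depth >= max_depth:
            continue
        # cf. block 1904: visit the neighbors of the current concept.
        for neighbor, strength in graph.get(concept, {}).items():
            if neighbor in included:
                continue
            # cf. decision block 1906: is the visited concept related?
            if strength >= min_strength:
                included.add(neighbor)
                theme.append(neighbor)  # cf. block 1908
                queue.append((neighbor, depth + 1))
    return theme


concept_graph = {
    "clean energy": {"solar": 0.9, "coal": 0.1, "wind": 0.8},
    "solar": {"photovoltaics": 0.7, "clean energy": 0.9},
    "wind": {"turbines": 0.4},
}
theme = build_theme(concept_graph, "clean energy", min_strength=0.5)
```

In an interactive embodiment, the returned concepts could instead be presented as candidates for a user to confirm before they are included in the theme.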

FIG. 20 illustrates a flowchart for process 2000 for modeling conformance to thematic concepts in accordance with one or more of the various embodiments. After a start block, at block 2002, in one or more of the various embodiments, portfolio engines may be arranged to provide one or more organizations. As described above, in some embodiments, portfolio engines may be arranged to select or determine one or more organizations using a variety of mechanisms. In some embodiments, portfolio engines may be arranged to include one or more user interfaces that provide search tools, filters, query engines, or the like, that users may employ to search for organizations that meet one or more criteria. Also, in some embodiments, portfolio engines may be arranged to select or determine one or more organizations from portfolios that include or associate one or more organizations. One of ordinary skill in the art will appreciate that other mechanisms may be employed to select or determine organizations to provide to scoring engines for determining thematic scores. For example, in some embodiments, one or more organizations or portfolios may be stored in favorite lists, saved lists, associated with user accounts, or the like. Further, in some embodiments, portfolio engines may be arranged to enable filters, conditions, triggers, or the like, that may automatically select organizations to provide to scoring engines based on one or more characteristics of the organizations or other metrics that may be observed in ingested information or from other sources.

At block 2004, in one or more of the various embodiments, portfolio engines may be arranged to provide a theme that may be associated with one or more concepts. Similar to how portfolio platforms may employ or provide a variety of mechanisms to select or determine organizations, portfolio platforms may be arranged to employ a variety of mechanisms to select or determine themes (e.g., collections of one or more concepts) that may be provided. For example, in some embodiments, portfolio engines may be arranged to enable users or account holders to declare one or more themes that they have interest in. Likewise, in some embodiments, portfolio engines may be arranged to automatically select or determine one or more themes based on ingested information or observed metrics. For example, in one or more of the various embodiments, if a theme includes one or more concepts that rise in prominence, related themes may be selected or determined. For example, in some embodiments, if the number of mentions of one or more concepts in ingested information exceeds a threshold value, portfolio engines may be arranged to provide one or more of the related themes.
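As one possible illustration, automatic theme selection based on a mention threshold may be sketched as follows. The sketch assumes mention counts have already been tallied from ingested information; the `themes_above_threshold` function, the theme names, and the counts are hypothetical.

```python
from collections import Counter


def themes_above_threshold(themes, mention_counts, threshold):
    """Return names of themes having a concept mentioned more than `threshold` times."""
    return [
        name
        for name, concepts in themes.items()
        if any(mention_counts.get(c, 0) > threshold for c in concepts)
    ]


# Hypothetical concept mentions observed in ingested information.
mention_counts = Counter(
    ["solar", "solar", "solar", "wind", "recycling", "solar", "wind"]
)
themes = {
    "clean energy": ["solar", "wind"],
    "circular economy": ["recycling", "reuse"],
}
selected = themes_above_threshold(themes, mention_counts, threshold=3)
```

The threshold value would typically come from configuration information rather than being fixed in code.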

Accordingly, in one or more of the various embodiments, portfolio engines may be arranged to provide each theme of interest to a scoring engine to generate thematic scores based on the provided themes.

At block 2006, in one or more of the various embodiments, portfolio engines may be arranged to provide one or more concepts based on the theme. As described above, themes may be associated with one or more concepts. Accordingly, in some embodiments, scoring engines may be arranged to evaluate each associated concept to generate one or more partial scores that may be employed for generating thematic scores.

At block 2008, in one or more of the various embodiments, optionally, portfolio engines may be arranged to determine one or more scoring models. As described above, scoring models may be data structures that encapsulate rules, instructions, classifiers, sub-models, or the like, for computing thematic scores.

In one or more of the various embodiments, portfolio engines may be arranged to automatically determine scoring models based on various factors, including, users, account holders, organizations, geographic location, or the like. Similarly, in some embodiments, portfolio engines may be arranged to select scoring models based on one or more indirect metrics, such as, risk level, precision, accuracy, or the like. For example, in some embodiments, some scoring models may employ rules/criteria that may be more or less liberal than others.

Likewise, in some embodiments, portfolio engines may be arranged to execute experimental evaluations that employ new or experimental scoring models to compare results with other scoring models.

Note, this block is indicated as being optional because in some embodiments scoring models may be determined previously rather than at block 2008.

At block 2010, in one or more of the various embodiments, portfolio engines may be arranged to employ the one or more scoring models to determine thematic scores for the one or more organizations and one or more concepts. As described above, in some embodiments, scoring engines may be arranged to generate various partial scores or partial results for each pair of concepts and organizations.

In one or more of the various embodiments, scoring engines may be arranged to generate an organization score that evaluates the direct relationships the organizations may have with the concept or its related concepts. Accordingly, in some embodiments, one or more knowledge graphs or data graphs described above may be employed to determine how concepts relate to organizations.

In one or more of the various embodiments, scoring engines may be arranged to generate one or more partial scores, such as, organization scores, network scores, general scores, or the like. Also, in some embodiments, scoring models may be arranged to generate one or more additional or supplemental scores or metrics, including, trend values, predicted values, or the like.

In one or more of the various embodiments, scoring engines may be arranged to generate organization scores that represent and quantify the relationship or association between the organization of interest and the concept of interest. In one or more of the various embodiments, the semantic meaning of the organization score may vary depending on the scoring models being employed. However, in one or more of the various embodiments, organization scores may be considered to represent how important or prominent the concept of interest may be to the organization.

Also, as described above, scoring engines may be arranged to generate other partial scores, such as, network scores for evaluating the prominence of the concept of interest in other organizations that may be considered in the network of the organization of interest. In some embodiments, portfolio engines may be arranged to determine the organizations that may be in the same network based on the data graph. As described above, portfolio engines may be arranged to determine a network of organizations based on one or more dimensions of relationships that may be captured in one or more knowledge graphs.

In one or more of the various embodiments, scoring engines may be arranged to traverse one or more concept graphs, entity graphs, data graphs, or the like, to evaluate the prominence or strength of the relationships between concepts and organizations. Note, evaluating relationships between concepts and organizations may be described as traversing one or more knowledge graphs. However, one of ordinary skill in the art will appreciate that the implementation of the data structures representing knowledge graphs is not limited to conventional graphs with explicit nodes or edges. For example, in some embodiments, knowledge graphs may be implemented using contiguous memory, optimized indexes, caches, or the like, such that a traversal of a knowledge graph may include employing index pointers, counters, loops, or the like, that provide a logical traversal of the knowledge graphs of interest.
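As a sketch of one such non-conventional representation, a graph may be stored in flat arrays (a layout often called compressed sparse row), so that a logical traversal is carried out with index offsets and loops rather than explicit node or edge objects. This particular layout is an assumption chosen for illustration and is not stated in the description; the array names and sample values are hypothetical.

```python
# Node index -> name, kept only for readability of the example.
names = ["Organization A", "clean energy", "solar"]

# Edges of node i occupy positions offsets[i] .. offsets[i + 1] - 1
# in the parallel `targets` and `strengths` arrays.
offsets = [0, 2, 3, 3]
targets = [1, 2, 2]
strengths = [0.8, 0.3, 0.9]


def neighbors(node):
    """Yield (target, strength) pairs for `node` using index arithmetic only."""
    for i in range(offsets[node], offsets[node + 1]):
        yield targets[i], strengths[i]


def relationship_strength(src, dst):
    """Return the strength of the direct relationship, or 0.0 if none exists."""
    for target, strength in neighbors(src):
        if target == dst:
            return strength
    return 0.0


org_to_clean_energy = relationship_strength(0, 1)
```

Because the edges of each node are stored contiguously, a scoring engine could scan relationships with simple counters while still performing what is logically a graph traversal.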

Also, in some embodiments, scoring engines may be arranged to determine one or more of the range of the score values, weights, scaling, leveling, bucketing, or the like, based on the scoring models being employed.

At decision block 2012, in one or more of the various embodiments, if there may be more concepts in the theme, control may loop back to block 2006; otherwise, control may flow to block 2014. As described above, in some embodiments, one or more concepts may be grouped or otherwise associated into a theme. Accordingly, in some embodiments, scoring engines may be arranged to generate thematic scores by scoring/evaluating each concept that may be associated with a theme. Note, in some embodiments, individual concepts may be considered absent an association with a theme. However, in some embodiments, in the case of generating a thematic score for a single concept, the single concept may be considered a theme.

At block 2014, in one or more of the various embodiments, portfolio engines may be arranged to employ the one or more scoring models to determine an overall theme score for the one or more organizations and the one or more concepts.

As described above, scoring engines may be arranged to employ scoring models to generate one or more partial scores that represent different aspects of the relationship between an organization and one or more concepts. Accordingly, in some embodiments, scoring engines may be arranged to employ the one or more scoring models to determine how to combine the partial scores to provide the values that may be stored in a score container. For example, in some embodiments, organization scores may be weighted higher than general scores, or the like. Likewise, some scoring models may be arranged to weight one or more concepts differently. Accordingly, in some embodiments, scoring engines may be arranged to employ the one or more scoring models to determine the actions or rules for combining the partial scores generated for each concept into thematic scores.
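The per-concept loop of process 2000 and the final combination may be sketched as follows. This is a minimal sketch, assuming the scoring model supplies weights for the partial score types and for the concepts, and that the thematic score is a normalized weighted average of the per-concept scores; the `ScoringModel` and `ScoreContainer` types, the `partials_for` callback, and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ScoringModel:
    partial_weights: Dict[str, float]  # e.g., organization weighted above general
    concept_weights: Dict[str, float]  # concepts may be weighted differently


@dataclass(frozen=True)
class ScoreContainer:
    organization: str
    concept_scores: Dict[str, float]
    thematic_score: float


def score_theme(org: str, concepts: List[str],
                partials_for: Callable[[str, str], Dict[str, float]],
                model: ScoringModel) -> ScoreContainer:
    concept_scores = {}
    for concept in concepts:  # cf. blocks 2006 through 2012
        partials = partials_for(org, concept)  # cf. block 2010
        concept_scores[concept] = sum(
            model.partial_weights.get(kind, 0.0) * value
            for kind, value in partials.items()
        )
    # cf. block 2014: combine per-concept scores into an overall theme score.
    total = sum(model.concept_weights.get(c, 1.0) for c in concepts)
    thematic = (
        sum(model.concept_weights.get(c, 1.0) * s
            for c, s in concept_scores.items()) / total
        if total else 0.0
    )
    return ScoreContainer(org, concept_scores, thematic)


# Hypothetical partial scores keyed by (organization, concept).
PARTIALS = {
    ("Organization B", "Clean Energy"):
        {"organization": 50.0, "network": 40.0, "general": 30.0},
    ("Organization B", "Recycling"):
        {"organization": 10.0, "network": 20.0, "general": 30.0},
}
model = ScoringModel(
    partial_weights={"organization": 0.6, "network": 0.3, "general": 0.1},
    concept_weights={"Clean Energy": 3.0, "Recycling": 1.0},
)
result = score_theme("Organization B", ["Clean Energy", "Recycling"],
                     lambda o, c: PARTIALS[(o, c)], model)
```

Keeping the weights in the model object rather than in the loop reflects the idea that different scoring models may be swapped in without changing the control flow.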

Next, in one or more of the various embodiments, control may be returned to a calling process.

It will be understood that each block in each flowchart illustration, and combinations of blocks in each flowchart illustration, can be implemented by computer program instructions. These program instructions may be provided to a processor to produce a machine, such that the instructions, which execute on the processor, create means for implementing the actions specified in each flowchart block or blocks. The computer program instructions may be executed by a processor to cause a series of operational steps to be performed by the processor to produce a computer-implemented process such that the instructions, which execute on the processor, provide steps for implementing the actions specified in each flowchart block or blocks. The computer program instructions may also cause at least some of the operational steps shown in the blocks of each flowchart to be performed in parallel. Moreover, some of the steps may also be performed across more than one processor, such as might arise in a multi-processor computer system. In addition, one or more blocks or combinations of blocks in each flowchart illustration may also be performed concurrently with other blocks or combinations of blocks, or even in a different sequence than illustrated without departing from the scope or spirit of the invention.

Accordingly, each block in each flowchart illustration supports combinations of means for performing the specified actions, combinations of steps for performing the specified actions and program instruction means for performing the specified actions. It will also be understood that each block in each flowchart illustration, and combinations of blocks in each flowchart illustration, can be implemented by special purpose hardware-based systems, which perform the specified actions or steps, or combinations of special purpose hardware and computer instructions. The foregoing example should not be construed as limiting or exhaustive, but rather, an illustrative use case to show an implementation of at least one of the various embodiments of the invention.

Further, in one or more embodiments (not shown in the figures), the logic in the illustrative flowcharts may be executed using an embedded logic hardware device instead of a CPU, such as, an Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA), Programmable Array Logic (PAL), or the like, or combination thereof. The embedded logic hardware device may directly execute its embedded logic to perform actions. In one or more embodiments, a microcontroller may be arranged to directly execute its own embedded logic to perform actions and access its own internal memory and its own external Input and Output Interfaces (e.g., hardware pins or wireless transceivers) to perform actions, such as System On a Chip (SOC), or the like.

What is claimed as new and desired to be protected by Letters Patent of the United States is:
 1. A method for managing data using one or more network computers, wherein one or more processors execute instructions to perform actions, comprising: providing a data graph based on one or more knowledge graphs and information provided by one or more data sources; providing one or more concepts and one or more entities based on the data graph; determining one or more scoring models based on the one or more concepts and the one or more entities; providing one or more themes that refer to one or more of a concept that encompasses one or more other concepts, another theme that encompasses one or more concepts that are associated with two or more themes, or a compound theme that includes one or more sub-themes; generating one or more thematic scores for the one or more entities based on the one or more themes and the one or more scoring models and the data graph that are associated with the at least portion of the one or more concepts, wherein the one or more thematic scores include one or more values that quantify each relationship between the one or more concepts and the one or more entities, and wherein an entity with a higher thematic score value for a concept has a relationship strength value that exceeds another relationship strength value for another entity with a lower thematic score value for the concept; providing one or more user interfaces having tools to search for one or more organizations based on selection of at least one theme that is associated with one or more concepts having a number of mentions above a threshold value; and providing a report that includes the one or more thematic scores, the one or more entities, and the one or more concepts.
 2. The method of claim 1, wherein generating the one or more thematic scores, further comprises, traversing the data graph to determine the one or more entities that are related to the one or more concepts based on one or more relationships represented in the data graph, wherein the one or more thematic scores are based on the strength of relationship between each entity and each concept.
 3. The method of claim 1, wherein providing the one or more entities, further comprises, providing one or more portfolios that include the one or more entities, wherein each included entity contributes one or more partial thematic scores based on the one or more scoring models.
 4. The method of claim 1, wherein providing the one or more concepts, further comprises: providing a theme based on an anchor concept; and traversing a concept graph to determine the one or more concepts based on one or more relationships between the one or more concepts and the anchor concept, wherein each relationship between the anchor concept and the one or more concepts is associated with a relationship strength value that exceeds a threshold value.
 5. The method of claim 1, wherein providing the report that includes the one or more thematic scores, further comprises, displaying a value that quantifies a strength of each relationship between the one or more entities and the one or more concepts.
 6. The method of claim 1, wherein providing the one or more thematic scores, further comprises: employing the one or more scoring models to provide one or more sub-scores for each thematic score based on a weighting that corresponds to a contribution of the one or more sub-scores to the one or more thematic scores; and generating one or more score containers that each include one or more values for the one or more weighted sub-scores that correspond to the one or more entities.
 7. A system for managing data over a network, comprising: a network computer, comprising: a memory that stores at least instructions; and one or more processors that execute instructions that perform actions, including: providing a data graph based on one or more knowledge graphs and information provided by one or more data sources; providing one or more concepts and one or more entities based on the data graph; determining one or more scoring models based on the one or more concepts and the one or more entities; providing one or more themes that refer to one or more of a concept that encompasses one or more other concepts, another theme that encompasses one or more concepts that are associated with two or more themes, or a compound theme that includes one or more sub-themes; generating one or more thematic scores for the one or more entities based on the one or more themes and the one or more scoring models and the data graph that are associated with at least a portion of the one or more concepts, wherein the one or more thematic scores include one or more values that quantify each relationship between the one or more concepts and the one or more entities, and wherein an entity with a higher thematic score value for a concept has a relationship strength value that exceeds another relationship strength value for another entity with a lower thematic score value for the concept; providing one or more user interfaces having tools to search for one or more organizations based on selection of at least one theme that is associated with one or more concepts having a number of mentions above a threshold value; and providing a report that includes the one or more thematic scores, the one or more entities, and the one or more concepts; and a client computer, comprising: another memory that stores at least instructions; and one or more other processors that execute other instructions that perform actions, including: displaying the report.
 8. The system of claim 7, wherein generating the one or more thematic scores, further comprises, traversing the data graph to determine the one or more entities that are related to the one or more concepts based on one or more relationships represented in the data graph, wherein the one or more thematic scores are based on the strength of relationship between each entity and each concept.
 9. The system of claim 7, wherein providing the one or more entities, further comprises, providing one or more portfolios that include the one or more entities, wherein each included entity contributes one or more partial thematic scores based on the one or more scoring models.
 10. The system of claim 7, wherein providing the one or more concepts, further comprises: providing a theme based on an anchor concept; and traversing a concept graph to determine the one or more concepts based on one or more relationships between the one or more concepts and the anchor concept, wherein each relationship between the anchor concept and the one or more concepts is associated with a relationship strength value that exceeds a threshold value.
 11. The system of claim 7, wherein providing the report that includes the one or more thematic scores, further comprises, displaying a value that quantifies a strength of each relationship between the one or more entities and the one or more concepts.
 12. The system of claim 7, wherein providing the one or more thematic scores, further comprises: employing the one or more scoring models to provide one or more sub-scores for each thematic score based on a weighting that corresponds to a contribution of the one or more sub-scores to the one or more thematic scores; and generating one or more score containers that each include one or more values for the one or more weighted sub-scores that correspond to the one or more entities.
 13. A network computer for managing data, comprising: a memory that stores at least instructions; and one or more processors that execute instructions that perform actions, including: providing a data graph based on one or more knowledge graphs and information provided by one or more data sources; providing one or more concepts and one or more entities based on the data graph; determining one or more scoring models based on the one or more concepts and the one or more entities; providing one or more themes that refer to one or more of a concept that encompasses one or more other concepts, another theme that encompasses one or more concepts that are associated with two or more themes, or a compound theme that includes one or more sub-themes; generating one or more thematic scores for the one or more entities based on the one or more themes and the one or more scoring models and the data graph that are associated with at least a portion of the one or more concepts, wherein the one or more thematic scores include one or more values that quantify each relationship between the one or more concepts and the one or more entities, and wherein an entity with a higher thematic score value for a concept has a relationship strength value that exceeds another relationship strength value for another entity with a lower thematic score value for the concept; providing one or more user interfaces having tools to search for one or more organizations based on selection of at least one theme that is associated with one or more concepts having a number of mentions above a threshold value; and providing a report that includes the one or more thematic scores, the one or more entities, and the one or more concepts.
 14. The network computer of claim 13, wherein generating the one or more thematic scores, further comprises, traversing the data graph to determine the one or more entities that are related to the one or more concepts based on one or more relationships represented in the data graph, wherein the one or more thematic scores are based on the strength of relationship between each entity and each concept.
 15. The network computer of claim 13, wherein providing the one or more entities, further comprises, providing one or more portfolios that include the one or more entities, wherein each included entity contributes one or more partial thematic scores based on the one or more scoring models.
 16. The network computer of claim 13, wherein providing the one or more concepts, further comprises: providing a theme based on an anchor concept; and traversing a concept graph to determine the one or more concepts based on one or more relationships between the one or more concepts and the anchor concept, wherein each relationship between the anchor concept and the one or more concepts is associated with a relationship strength value that exceeds a threshold value.
 17. The network computer of claim 13, wherein providing the report that includes the one or more thematic scores, further comprises, displaying a value that quantifies a strength of each relationship between the one or more entities and the one or more concepts.
 18. The network computer of claim 13, wherein providing the one or more thematic scores, further comprises: employing the one or more scoring models to provide one or more sub-scores for each thematic score based on a weighting that corresponds to a contribution of the one or more sub-scores to the one or more thematic scores; and generating one or more score containers that each include one or more values for the one or more weighted sub-scores that correspond to the one or more entities.
 19. A processor readable non-transitory storage media that includes instructions for managing data, wherein execution of the instructions by one or more hardware processors performs actions, comprising: providing a data graph based on one or more knowledge graphs and information provided by one or more data sources; providing one or more concepts and one or more entities based on the data graph; determining one or more scoring models based on the one or more concepts and the one or more entities; providing one or more themes that refer to one or more of a concept that encompasses one or more other concepts, another theme that encompasses one or more concepts that are associated with two or more themes, or a compound theme that includes one or more sub-themes; generating one or more thematic scores for the one or more entities based on the one or more themes and the one or more scoring models and the data graph that are associated with at least a portion of the one or more concepts, wherein the one or more thematic scores include one or more values that quantify each relationship between the one or more concepts and the one or more entities, and wherein an entity with a higher thematic score value for a concept has a relationship strength value that exceeds another relationship strength value for another entity with a lower thematic score value for the concept; providing one or more user interfaces having tools to search for one or more organizations based on selection of at least one theme that is associated with one or more concepts having a number of mentions above a threshold value; and providing a report that includes the one or more thematic scores, the one or more entities, and the one or more concepts.
 20. The media of claim 19, wherein generating the one or more thematic scores, further comprises, traversing the data graph to determine the one or more entities that are related to the one or more concepts based on one or more relationships represented in the data graph, wherein the one or more thematic scores are based on the strength of relationship between each entity and each concept.
 21. The media of claim 19, wherein providing the one or more entities, further comprises, providing one or more portfolios that include the one or more entities, wherein each included entity contributes one or more partial thematic scores based on the one or more scoring models.
 22. The media of claim 19, wherein providing the one or more concepts, further comprises: providing a theme based on an anchor concept; and traversing a concept graph to determine the one or more concepts based on one or more relationships between the one or more concepts and the anchor concept, wherein each relationship between the anchor concept and the one or more concepts is associated with a relationship strength value that exceeds a threshold value.
 23. The media of claim 19, wherein providing the report that includes the one or more thematic scores, further comprises, displaying a value that quantifies a strength of each relationship between the one or more entities and the one or more concepts.
 24. The media of claim 19, wherein providing the one or more thematic scores, further comprises: employing the one or more scoring models to provide one or more sub-scores for each thematic score based on a weighting that corresponds to a contribution of the one or more sub-scores to the one or more thematic scores; and generating one or more score containers that each include one or more values for the one or more weighted sub-scores that correspond to the one or more entities. 
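
By way of non-limiting illustration only, the following sketch shows one way the theme expansion recited in claims 4, 10, 16, and 22 and the weighted sub-scores and score containers recited in claims 6, 12, 18, and 24 might be expressed. It is not the claimed implementation. The graph contents, the sub-score names ("mentions", "filings"), the function names, and the choices to follow multi-hop relationships and to sum the weighted sub-scores are all assumptions made for the example.

```python
from collections import deque

# Hypothetical concept graph: concept -> {related concept: relationship strength}.
# All names and values are illustrative assumptions.
CONCEPT_GRAPH = {
    "clean energy": {"solar power": 0.9, "wind power": 0.8, "coal": 0.2},
    "solar power": {"photovoltaics": 0.7},
    "wind power": {},
    "coal": {},
    "photovoltaics": {},
}


def expand_theme(graph, anchor, threshold):
    """Breadth-first traversal from the anchor concept that keeps only
    concepts reached over relationships whose strength exceeds threshold.
    Following multi-hop relationships is an assumption of this sketch."""
    selected = {anchor}
    queue = deque([anchor])
    while queue:
        current = queue.popleft()
        for neighbor, strength in graph.get(current, {}).items():
            if strength > threshold and neighbor not in selected:
                selected.add(neighbor)
                queue.append(neighbor)
    return selected


def build_score_container(entity, sub_scores, weights):
    """Weight each sub-score by its contribution and collect the results in
    a score container. Summing the weighted sub-scores to obtain the
    thematic score is an assumed (hypothetical) scoring model."""
    weighted = {name: sub_scores[name] * weights[name] for name in weights}
    return {
        "entity": entity,
        "weighted_sub_scores": weighted,
        "thematic_score": sum(weighted.values()),
    }


if __name__ == "__main__":
    theme_concepts = expand_theme(CONCEPT_GRAPH, "clean energy", threshold=0.5)
    print(sorted(theme_concepts))

    container = build_score_container(
        "Example Org",                        # hypothetical entity
        {"mentions": 0.8, "filings": 0.4},    # hypothetical sub-scores
        {"mentions": 0.5, "filings": 0.5},    # hypothetical weights
    )
    print(container)
```

In this sketch, the threshold prunes weakly related concepts (here, "coal") from the theme, and the score container retains the individual weighted sub-scores alongside the combined value so that a report could display each contribution.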